ABSTRACT



A method and apparatus are disclosed for optimizing the 
runtime behavior of database or other applications by allow- 
ing selection of alternative code segments during linking of 
pre-compiled object modules. A macro-preprocessor inserts 
a declaration for a global variable in the source code in 
response to an occurrence of a command of interest. The 
linker selects object modules for executing other commands 
based on the presence or absence of the declaration for the 
global variable in the preprocessed source code. The method 
and apparatus are useful in implementing programming 
language statements including non-procedural programming 
languages such as the Embedded Structured Query Lan- 
guage (ESQL). 
LINKTIME RECOGNITION OF ALTERNATIVE
IMPLEMENTATIONS OF PROGRAMMED 
FUNCTIONALITY 

TECHNICAL FIELD OF THE INVENTION 

[0001] This invention relates generally to generating an 
efficient executable corresponding to a program written in a 
higher-level computer programming language, and more 
particularly to using indicators that change the executable code
to take into account the context in which a particular
command is implemented.

BACKGROUND OF THE INVENTION 

[0002] The advent of databases and e-commerce requires
the ability to request services from a variety of databases 
without knowing the exact implementation of the database 
or of the statements used to request the services. These 
request statements are made in a non-procedural program- 
ming language that does not provide an explicit implemen- 
tation. Instead, the developers of particular databases or 
non-procedural programming language statements provide 
proprietary implementations for the statements rendered in 
the non-procedural language. 

[0003] Structured Query Language ("SQL") is an illustra- 
tive example of a non-procedural language. SQL differs 
from a procedural language like FORTRAN in that it does 
not specify how a particular request is carried out, but 
instead allows the database manager to provide the relevant 
details. Thus, a command in SQL merely states a request and 
not how it is carried out. 

[0004] SQL includes: a Data Definition Language
("DDL") for creating databases and data structures, but not 
necessarily data itself; a Data Manipulation Language 
("DML") facilitating database maintenance and actual 
operations on data; and a Data Control Language ("DCL") 
for specifying security requirements. Some examples of 
SQL commands include the DDL commands CREATE, 
ALTER and DROP, DML statements and functions such as 
INSERT, UPDATE, DELETE, SELECT, COUNT, SUM and 
the like, and DCL commands such as COMMIT, ROLL- 
BACK, GRANT and REVOKE. 

[0005] SQL permits interactions with a database in an 
atomic manner, i.e. only one user may access a unit of data, 
to prevent other users from changing the database between 
operations constituting a transaction. The code used to 
implement these commands and functions is the responsi- 
bility of the database developer or vendor. Of course, 
universal support for SQL commands ensures that any user 
can access and use a SQL compliant database regardless of 
the database vendor and particular implementation details. 

[0006] SQL commands such as COMMIT and ROLL- 
BACK are of interest in an exemplary embodiment of the 
invention. These SQL commands protect a database against 
inadvertent corruption. To this end the database itself is not 
affected until the COMMIT command is given. If an error 
occurs then a ROLLBACK command restores the state of 
the system to that at the conclusion of the previous COM- 
MIT command. A transaction is terminated by either a 
COMMIT command or ROLLBACK command combined 
with allowing other users access to the data. A ROLLBACK 
command requires buffering of all operations following a
COMMIT command to permit restoration of the state
following the COMMIT command.

[0007] If the transaction fails or a user cancels a transac- 
tion, a ROLLBACK results in clearing the buffered opera- 
tions and removing access restrictions to restore the database 
to its state prior to the initiation of the now failed transaction. 
On the other hand, a COMMIT command results in updating 
the database followed by clearing of the buffered operations. 
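
By way of illustration only, the buffering behavior described above can be sketched in C. The type `TinyDb` and the functions `tiny_update`, `tiny_commit` and `tiny_rollback` are hypothetical names introduced for this sketch and are not part of any database product; the sketch assumes a single in-memory table of integers.

```c
#include <stddef.h>

#define MAX_ROWS 8
#define MAX_PENDING 16

/* Hypothetical in-memory "database": one integer per row. */
typedef struct {
    int rows[MAX_ROWS];
    int pending_row[MAX_PENDING];  /* row index of each buffered write */
    int pending_val[MAX_PENDING];  /* value of each buffered write */
    size_t pending_count;
} TinyDb;

/* Buffer a write; the stored rows are untouched until tiny_commit(). */
int tiny_update(TinyDb *db, int row, int value) {
    if (row < 0 || row >= MAX_ROWS || db->pending_count >= MAX_PENDING)
        return -1;
    db->pending_row[db->pending_count] = row;
    db->pending_val[db->pending_count] = value;
    db->pending_count++;
    return 0;
}

/* COMMIT: apply the buffered writes in order, then clear the buffer. */
void tiny_commit(TinyDb *db) {
    for (size_t i = 0; i < db->pending_count; i++)
        db->rows[db->pending_row[i]] = db->pending_val[i];
    db->pending_count = 0;
}

/* ROLLBACK: discard the buffered writes; the rows keep their prior state. */
void tiny_rollback(TinyDb *db) {
    db->pending_count = 0;
}
```

In this sketch every update pays for the bookkeeping in `tiny_update` even when the program never rolls back.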

[0008] Another SQL command, SAVEPOINT, enables 
restoring the system to an earlier defined state that need not 
be the state at the conclusion of the previous COMMIT 
command. Like the COMMIT command in the context of 
the ROLLBACK command, SAVEPOINT provides a prior 
state of the system for the ROLLBACK command. Unlike 
the COMMIT command, however, the SAVEPOINT com- 
mand does not require changes to the database. Instead 
SAVEPOINT enables specification of a defined state for 
system restoration. In some embodiments the SAVEPOINT 
command specifies multiple earlier states distinguished by 
their respective identifiers. If desired, the system can be 
restored to one of the specified earlier states by executing a 
ROLLBACK to the specified state. If a COMMIT command 
is given then all buffered operations are cleared along with 
the states specified by the SAVEPOINT command. 
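
The relationship between SAVEPOINT, ROLLBACK and COMMIT described above may be sketched as follows. This is a minimal illustration under stated assumptions: `TxnBuffer` and the `txn_*` functions are hypothetical names, buffered operations are represented by opaque integers, and a savepoint is modeled simply as the number of operations buffered when it was taken.

```c
#include <stddef.h>

#define MAX_OPS 16
#define MAX_SAVEPOINTS 4

typedef struct {
    int ops[MAX_OPS];                  /* buffered operations (opaque ids) */
    size_t op_count;
    size_t savepoint[MAX_SAVEPOINTS];  /* op_count recorded at each savepoint */
    size_t savepoint_count;
} TxnBuffer;

/* Buffer one operation. */
int txn_add(TxnBuffer *t, int op) {
    if (t->op_count >= MAX_OPS) return -1;
    t->ops[t->op_count++] = op;
    return 0;
}

/* SAVEPOINT: record the current state; returns an identifier or -1. */
int txn_savepoint(TxnBuffer *t) {
    if (t->savepoint_count >= MAX_SAVEPOINTS) return -1;
    t->savepoint[t->savepoint_count] = t->op_count;
    return (int)t->savepoint_count++;
}

/* ROLLBACK to a savepoint: drop operations buffered after it and any
   later savepoints; the named savepoint itself remains usable. */
int txn_rollback_to(TxnBuffer *t, int id) {
    if (id < 0 || (size_t)id >= t->savepoint_count) return -1;
    t->op_count = t->savepoint[id];
    t->savepoint_count = (size_t)id + 1;
    return 0;
}

/* COMMIT: clear the buffered operations along with all savepoints. */
void txn_commit(TxnBuffer *t) {
    t->op_count = 0;
    t->savepoint_count = 0;
}
```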

[0009] Implementing the SAVEPOINT or ROLLBACK 
commands requires considerable overhead since other com- 
mands must therefore provide buffering. On the other hand, 
it is not necessary to support buffering if the SAVEPOINT 
or ROLLBACK commands are not used. A typical applica- 
tion includes SQL statements in several files and a compiler 
compiles only one file at a time. Thus, it is not possible to 
decide when compiling a particular file whether buffering- 
related code is needed due to a statement in another file. 

[0010] SQL applications written using SQL statements 
and functions can be combined with source code in a 
programming language such as C++ in Embedded SQL 
("ESQL"). An ESQL application can include several source 
code files. The source files for an ESQL application are
preprocessed by a macro-preprocessor. Typically, the
macro-preprocessor generates code for the various embedded SQL
statements or introduces additional statements, after which a
compiler compiles the output of the macro-preprocessor.
Compiling a source file generates an object module corre- 
sponding to the source file. The linker links object modules 
to generate the executable program. 

[0011] Compiling a source code file includes several 
operations. A compiler parses the source code, carries out 
several checks to ensure conformity with the programming 
language specifications and then translates the parsed code 
to generate a lower level code such as machine code for 
execution on a computer. In some instances, the code is 
assembly or byte code that needs further translation for 
actual execution on a particular computer. A compiler allo- 
cates memory for each variable to properly translate source 
code to generate executable code. The compiler allocates 
memory for each variable in accordance with a "type" 
specification for the variable in question. 

[0012] Type information is specified in a "declaration" 
statement. Each variable is assigned a particular type. The 
compiler enters the type information for each variable into 
a symbol table associated with an object module. When 
several object modules in the same executable share a
variable it is important to ensure that only one module 
actually allocates memory for the variable. The compiler 
allocates memory in response to a "definition" statement for 
a particular variable. However, the declaration and/or defi- 
nition statements are allowed to be implicit in many pro- 
gramming languages. 

[0013] The "C" programming language permits an 
"extern" declaration in a source file that tells the compiler 
that memory for the specified variable is allocated in another 
file. Consequently, a C compiler only creates a variable entry 
in the symbol table that serves as a place holder for the 
variable but leaves the actual memory allocation to another 
file. The variable merely points to its entry in the symbol 
table and is redirected to the actual memory allocation 
following identification of the intended memory location. 
Thus, there are several declarations for a variable but there 
can be only one definition. No value can be assigned to a 
variable unless the variable is defined because there is no 
memory allocated to store it. 
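
The distinction between a declaration and a definition may be illustrated with a short C fragment. The variable name `shared_counter` is hypothetical, and the single definition is placed in the same file only so that the fragment compiles on its own; in the arrangement described above it would reside in a different source file.

```c
/* Declaration only: gives the compiler the name and type of the
   variable but allocates no storage for it. */
extern int shared_counter;

/* Code may refer to the variable once it has been declared. */
int read_counter(void) { return shared_counter; }
void bump_counter(void) { shared_counter++; }

/* Definition: allocates storage and sets the initial value. There may
   be many declarations in a program but only one definition. */
int shared_counter = 5;
```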

[0014] Following compilation, a linker links the resultant 
object files to generate the executable for the application. 
The linking may be static or dynamic. In static linking the
object files identified by the linker for the resolution of all 
variables are copied to generate an executable file. In 
contrast, dynamic linking allows fetching an object file at 
either load time or at runtime. Consequently, the same object 
module is used by several applications. As is evident, 
typically dynamic linking results in lower memory require- 
ments and smaller executable sizes. Furthermore, a pro- 
grammer can modify and recompile a dynamically-linked 
module independent of another module, thus making soft- 
ware maintenance easier and less expensive. 

[0015] Declaring a variable with an "extern" keyword
requires the linker to identify the actual memory allocated 
for the variable in other object modules. To this end the 
linker searches symbol tables associated with object mod- 
ules or libraries for a module providing a definition for the 
variable in question. This process is termed resolving the 
variable. Proper resolution of a variable is required before it 
can actually be used in an executable file. 

[0016] In software development projects a software appli- 
cation is refined over the life of the project. Through the 
development process, concepts concerning various prob- 
lems and solutions are often revised, and the functions and 
features of the final software application are often quite 
different from those at the beginning of the project. Building
support for additional features, needed only when certain other
statements are executed, into the implementation of a
non-procedural language statement reduces the execution
efficiency of programs that do not use these additional
features. On the other hand, adding distinct commands
to provide the additional features results in complex
programming languages with many statements differing 
only in the context in which they should be used. For 
example, if there is at least one command that requires 
buffering prior changes to a database in an SQL-based 
application, then implementations of other commands 
affecting the database need to support buffering. On the 
other hand, if no command requiring buffering is used in an 
application then the program overhead for buffering unnec- 
essarily slows down the application. 

[0017] As a programming language evolves to develop 
specific commands for a particular context, developers have
to learn different commands for accomplishing similar tasks
rather than preserving their existing familiarity with the 
programming tool. Similar sounding commands that differ 
in subtle but significant details increase the risk that a 
programmer inadvertently uses the less effective command. 
Such errors are difficult to identify since some may only 
sporadically result in bugs. Therefore, it is desirable to have 
a system and method for providing contextually efficient 
implementations for a programming language command that 
can be invoked automatically without requiring the pro- 
grammer to use different commands to invoke optimized 
implementations for different contexts. 

SUMMARY OF THE INVENTION 

[0018] In view of the foregoing, the present invention 
provides a method and system for selecting one of several 
implementations of a higher level programming language 
statement in response to the occurrence or non-occurrence of 
another statement in a computer program. The invention 
enables transparent selection of contextually efficient code. 
Thus, users and developers need not use different higher 
language statements to invoke a context-specific implemen- 
tation. 

[0019] A macro-preprocessor enables choosing a context 
sensitive implementation for a higher language statement. 
The macro-preprocessor introduces a first global variable
declaration in response to identifying a specific context. In 
an embodiment, the specific context is defined by the 
presence of one or more specified statements in a source file 
processed by the macro-preprocessor. In an embodiment of 
the invention, the first global variable enables setting a 
desired value for a second variable by introducing the 
second variable in a first object module supplying the 
definition for the first global variable. This strategy provides 
for a level of indirection to include the first object module in 
response to identifying a specified context. In the event the 
linker does not include the first object module, an alternative 
definition for the second variable is provided in a second 
object module. 
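
The level of indirection described above may be sketched in C as follows. All identifiers (`esql_ctx_marker`, `esql_buffering`, `esql_touch_marker`, `esql_buffering_enabled`) are hypothetical. The parts that would reside in separate files and library modules are collapsed into a single file so that the fragment compiles on its own, and the alternative definition is therefore shown only as a comment. The sketch also assumes that the generated code actually reads the first variable, since with common C toolchains a declaration that is never used typically produces no reference for the linker to resolve.

```c
/* ---- application file, after the (hypothetical) macro-preprocessor ---- */
extern int esql_ctx_marker;   /* first global variable: declared only */

/* Generated for the statement of interest. Reading the marker causes the
   object file to carry an unresolved reference to it. */
int esql_touch_marker(void) { return esql_ctx_marker; }

/* ---- first object module (library): pulled in by that reference ---- */
int esql_ctx_marker = 0;      /* definition of the first variable */
int esql_buffering = 1;       /* second variable, non-default value */

/* ---- second object module (library): used when the first is not ----
   int esql_buffering = 0;    // alternative, default definition       */

/* ---- runtime code common to both configurations ---- */
int esql_buffering_enabled(void) { return esql_buffering != 0; }
```

The first variable carries no information of its own in this sketch; its only role is to cause the module that sets the second variable to be included.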

[0020] In an embodiment of the invention a linker library 
object module is loaded using a wrapper based upon the first 
global variable. Moreover, an embodiment of the invention 
enables conditionally executing a first program sequence in 
response to the second variable value specifying a context of 
interest. 

[0021] For instance, the need to support the underlying 
implementation of a RESTORE command may require a 
DELETE command to include buffering deleted data. How- 
ever, if no RESTORE command is used in a program then 
there is no need to incur the overhead of buffering extensive 
information in the implementation of the DELETE com- 
mand. 
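
A minimal sketch of a DELETE implementation that buffers deleted data only when required is given below. The names `Table`, `table_delete` and `restore_in_use` are hypothetical, and the flag is an ordinary global variable here; how it obtains its value is outside the scope of this fragment.

```c
#include <stddef.h>

#define MAX_ROWS 8

typedef struct {
    int rows[MAX_ROWS];
    int live[MAX_ROWS];    /* 1 while the row exists */
    int saved[MAX_ROWS];   /* copies kept for a possible RESTORE */
    size_t saved_count;
} Table;

/* Hypothetical flag: nonzero only when the program uses RESTORE. */
int restore_in_use = 0;

/* DELETE one row; the deleted value is buffered only when the flag is set. */
int table_delete(Table *t, size_t row) {
    if (row >= MAX_ROWS || !t->live[row]) return -1;
    if (restore_in_use && t->saved_count < MAX_ROWS)
        t->saved[t->saved_count++] = t->rows[row];
    t->live[row] = 0;
    return 0;
}
```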

[0022] It should be noted that the invention, while illus- 
trated with SQL, is not limited to SQL or even non- 
procedural languages, but instead includes higher-level lan- 
guages and scripts. Such higher-level languages and scripts 
can benefit from using different binary, byte-code or macro
implementations for the same command depending on a 
particular context. 

[0023] Additional features and advantages of the inven- 
tion will be made apparent from the following detailed 
description of illustrative embodiments, which proceeds
with reference to the accompanying figures. 

BRIEF DESCRIPTION OF THE DRAWINGS 

[0024] While the appended claims set forth the features of 
the present invention with particularity, the invention, 
together with its objects and advantages, may be best 
understood from the following detailed description taken in 
conjunction with the accompanying drawings of which: 

[0025] FIG. 1 is a block diagram generally illustrating an 
exemplary computing environment in which databases and 
other software structures are implemented along with 
higher-level languages being used to describe desired ser- 
vices, including services pertaining to the database; 

[0026] FIG. 2 is a flowchart summarizing an exemplary 
set of steps of preprocessing, compiling, linking and execut- 
ing a computer program in a computing environment; 

[0027] FIG. 3 is a flowchart summarizing exemplary steps 
of an embodiment that introduces statements in the program 
code to generate an efficient runtime executable in accor- 
dance with the invention; 

[0028] FIG. 4 is a flow diagram illustratively depicting 
compilation of a program using a preprocessor, a compiler 
and a linker; 

[0029] FIG. 5 is a flow chart illustrating an exemplary set 
of steps for introducing a global variable reflecting the 
context of command in an embodiment of the invention; 

[0030] FIG. 6 is a flow diagram illustratively depicting 
transformation of a program from high level instructions to 
executable code, including transformation to the pre-pro- 
cessed code and subsequent incorporation of particular 
object modules based upon a detected context; 

[0031] FIG. 7 is a flow diagram illustrating the different 
implementations for a second statement due to the occur- 
rence or non-occurrence of a first statement in a computer 
program in accordance with an embodiment of the inven- 
tion; and 

[0032] FIG. 8 illustrates exemplary linker libraries. 

DETAILED DESCRIPTION OF THE 
INVENTION 

[0033] Turning to the drawings, wherein like reference 
numerals refer to like elements, the invention is illustrated as 
being implemented in a suitable computing environment. 
Although not required, the invention will be described in the 
general context of computer-executable instructions, such as 
program modules, being executed in a computing environ- 
ment. Generally, program modules include routines, pro- 
grams, objects, components, data structures, etc. that per- 
form particular tasks or implement particular abstract data 
types. Moreover, those skilled in the art will appreciate that 
the invention may be practiced with other computer system 
configurations, including hand-held devices, multi-proces- 
sor systems, microprocessor based or programmable con- 
sumer electronics, network PCs, minicomputers, mainframe 
computers, and the like. The invention may also be practiced 
in distributed computing environments where tasks are 
performed by remote processing devices that are linked 
through a communications network. In a distributed computing
environment, program modules may be located in
both local and remote memory storage devices. 

[0034] FIG. 1 illustrates an example of a suitable com- 
puting system environment 100 on which the invention may 
be implemented. The computing system environment 100 is 
only one example of a suitable computing environment and 
is not intended to suggest any limitation as to the scope of 
use or functionality of the invention. Neither should the 
computing environment 100 be interpreted as having any 
dependency or requirement relating to any one or combina- 
tion of components illustrated in the exemplary operating 
environment 100. 

[0035] The invention is operational with numerous other 
general-purpose or special-purpose computing system environments
or configurations. Examples of well-known computing
systems, environments, and configurations that may
be suitable for use with the invention include, but are not 
limited to, personal computers, server computers, hand-held 
or laptop devices, multiprocessor systems, microprocessor- 
based systems, set top boxes, programmable consumer elec- 
tronics, network PCs, minicomputers, mainframe comput- 
ers, and distributed computing environments that include 
any of the above systems or devices. 

[0036] The invention may be described in the general 
context of computer-executable instructions, such as pro- 
gram modules, being executed by a computer. Generally, 
program modules include routines, programs, objects, com- 
ponents, data structures, etc., that perform particular tasks or 
implement particular abstract data types. The invention may 
also be practiced in distributed computing environments 
where tasks are performed by remote processing devices that 
are linked through a communications network. In a distrib- 
uted computing environment, program modules may be 
located in both local and remote computer storage media 
including memory storage devices. 

[0037] With reference to FIG. 1, an exemplary system for 
implementing the invention includes a general-purpose 
computing device in the form of a computer 110. Compo- 
nents of the computer 110 may include, but are not limited 
to, a processing unit 120, a system memory 130, and a 
system bus 121 that couples various system components 
including the system memory to the processing unit 120. 
The system bus 121 may be any of several types of bus 
structures including a memory bus or memory controller, a 
peripheral bus, and a local bus using any of a variety of bus 
architectures. By way of example, and not limitation, such 
architectures include Industry Standard Architecture (ISA) 
bus, Micro Channel Architecture (MCA) bus, Enhanced ISA 
(EISA) bus, Video Electronics Standards Association 
(VESA) local bus, and Peripheral Component Interconnect 
(PCI) bus, also known as Mezzanine bus. 

[0038] The computer 110 typically includes a variety of 
computer-readable media. Computer-readable media can be 
any available media that can be accessed by the computer 
110 and include both volatile and nonvolatile media, remov- 
able and non-removable media. By way of example, and not 
limitation, computer-readable media may include computer 
storage media and communications media. Computer stor- 
age media includes volatile and nonvolatile, removable and 
non-removable media implemented in any method or tech- 
nology for storage of information such as computer-readable 
instructions, data structures, program modules, or other data. 
Computer storage media include, but are not limited to,
random-access memory (RAM), read-only memory (ROM), 
EEPROM, flash memory, or other memory technology, 
CD-ROM, digital versatile disks (DVD), or other optical 
disk storage, magnetic cassettes, magnetic tape, magnetic 
disk storage, or other magnetic storage devices, or any other 
medium which can be used to store the desired information 
and which can be accessed by the computer 110. Communications
media typically embody computer-readable instructions,
data structures, program modules, or other data in a
modulated data signal such as a carrier wave or other 
transport mechanism and include any information delivery 
media. The term "modulated data signal" means a signal that 
has one or more of its characteristics set or changed in such 
a manner as to encode information in the signal. By way of 
example, and not limitation, communications media include 
wired media such as a wired network and a direct-wired
connection and wireless media such as acoustic, RF, and
infrared media. Combinations of any of the above should
also be included within the scope of computer-readable
media. 

[0039] The system memory 130 includes computer stor- 
age media in the form of volatile and nonvolatile memory 
such as ROM 131 and RAM 132. A basic input/output 
system (BIOS) 133, containing the basic routines that help 
to transfer information between elements within the com- 
puter 110, such as during start-up, is typically stored in ROM 
131. RAM 132 typically contains data and program modules 
that are immediately accessible to or presently being oper- 
ated on by processing unit 120. By way of example, and not 
limitation, FIG. 1 illustrates an operating system 134, appli- 
cation programs 135, other program modules 136, and 
program data 137. Often, the operating system 134 offers 
services to applications programs 135 by way of one or more 
application programming interfaces (APIs) (not shown). 
Because the operating system 134 incorporates these ser- 
vices, developers of applications programs 135 need not 
redevelop code to use the services. Examples of APIs 
provided by operating systems such as Microsoft's "WIN- 
DOWS" are well known in the art. 

[0040] The computer 110 may also include other remov- 
able/non-removable, volatile/nonvolatile computer storage 
media. By way of example only, FIG. 1 illustrates a hard 
disk interface 140 that reads from and writes to non- 
removable, nonvolatile magnetic media, a magnetic disk 
drive 151, which may be internal or external, that reads from 
and writes to a removable, nonvolatile magnetic disk 152, 
and an optical disk drive 155 that reads from and writes to 
a removable, nonvolatile optical disk 156 such as a CD 
ROM. Other removable/non-removable, volatile/nonvolatile 
computer storage media that can be used in the exemplary 
operating environment include, but are not limited to, mag- 
netic tape cassettes, flash memory cards, DVDs, digital 
video tape, solid state RAM, and solid state ROM. The hard 
disk drive 141, which may be internal or external, is typi- 
cally connected to the system bus 121 through a non- 
removable memory interface such as interface 140, and 
magnetic disk drive 151 and optical disk drive 155 are 
typically connected to the system bus 121 by a removable 
memory interface, such as interface 150. 

[0041] The drives and their associated computer storage 
media discussed above and illustrated in FIG. 1 provide 
storage of computer-readable instructions, data structures,
program modules, and other data for the computer 110. In
FIG. 1, for example, hard disk drive 141 is illustrated as 
storing an operating system 144, application programs 145, 
other program modules 146, and program data 147. Note 
that these components can either be the same as or different 
from the operating system 134, application programs 135, 
other program modules 136, and program data 137. The 
operating system 144, application programs 145, other pro- 
gram modules 146, and program data 147 are given different 
numbers here to illustrate that they may be different copies. 
A user may enter commands and information into the 
computer 110 through input devices such as a keyboard 162 
and pointing device 161, commonly referred to as a mouse, 
trackball, or touch pad. Other input devices (not shown) may 
include a microphone, joystick, game pad, satellite dish, and 
scanner. These and other input devices are often connected 
to the processing unit 120 through a user input interface 160 
that is coupled to the system bus, but may be connected by 
other interface and bus structures, such as a parallel port, 
game port, or a universal serial bus (USB). A monitor 191 or 
other type of display device is also connected to the system 
bus 121 via an interface, such as a video interface 190. In 
addition to the monitor, computers may also include other 
peripheral output devices such as speakers 197 and printer 
196, which may be connected through an output peripheral 
interface 195. 

[0042] The computer 110 may operate in a networked 
environment using logical connections to one or more 
remote computers, such as a remote computer 180. The 
remote computer 180 may be a personal computer, a server, 
a router, a network PC, a peer device, or other common 
network node, and typically includes many or all of the 
elements described above relative to the computer 110, 
although only a memory storage device 181 has been 
illustrated in FIG. 1. The logical connections depicted in 
FIG. 1 include a local area network (LAN) 171 and a wide 
area network (WAN) 173, but may also include other 
networks. Such networking environments are commonplace 
in offices, enterprise-wide computer networks, intranets, and
the Internet. 

[0043] When used in a LAN networking environment, the 
computer 110 is connected to the LAN 171 through a 
network interface or adapter 170. When used in a WAN 
networking environment, the computer 110 typically 
includes a modem 172 or other means for establishing 
communications over the WAN 173, such as the Internet. 
The modem 172, which may be internal or external, may be 
connected to the system bus 121 via the user-input interface 
160, or via another appropriate mechanism. In a networked 
environment, program modules depicted relative to the 
computer 110, or portions thereof, may be stored in a remote 
memory storage device. By way of example, and not limi- 
tation, FIG. 1 illustrates remote application programs 185 as 
residing on memory device 181, which may be internal or 
external to the remote computer 180. It will be appreciated 
that the network connections shown are exemplary and other 
means of establishing a communications link between the 
computers may be used. 

[0044] In the description that follows, the invention will 
be described with reference to acts and symbolic represen- 
tations of operations that are performed by one or more 
computers, unless indicated otherwise. As such, it will be 
understood that such acts and operations, which are at times 
referred to as being computer-executed, include the manipulation
by the processing unit of the computer of electrical
signals representing data in a structured form. This manipu- 
lation transforms the data or maintains them at locations in 
the memory system of the computer, which reconfigures or 
otherwise alters the operation of the computer in a manner 
well understood by those skilled in the art. The data struc- 
tures where data are maintained are physical locations of the 
memory that have particular properties defined by the format 
of the data. However, while the invention is being described 
in the foregoing context, it is not meant to be limiting as 
those of skill in the art will appreciate that various of the acts 
and operations described hereinafter may also be imple- 
mented in hardware. 

[0045] An embodiment of the present invention illustrated 
in FIG. 2 enables a software developer to efficiently develop 
applications suited to particular uses, including
those for accessing, managing and otherwise utilizing data- 
bases. During step 200 a programmer or developer generates 
program code comprising procedural and non-procedural 
programming languages. Examples of suitable program- 
ming languages include ESQL that allows embedding SQL 
commands in programs written in the C programming 
language. The program code is pre-processed by a macro- 
preprocessor during step 205. During step 205 the macro- 
preprocessor examines the source code for an occurrence of 
one or more specified statements. During step 210 any one
of the specified statements is detected. Then the macro-preprocessor
introduces a declaration for a first variable that
is explicitly not defined. In the case of ESQL the first
variable is declared using the "extern" keyword to indicate
to a compiler that no storage should be allocated since it
would be allocated in another module. The macro-preprocessor
also inserts statements that are in the form of function
calls compliant with the Open Data Base Connectivity
("ODBC") standard during step 215, but may include other
mechanisms. Step 215 provides the proprietary implementations
for a particular database since, as explained earlier,
the actual implementation of a non-procedural language 
statement is not specified. 
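
The detection and declaration of steps 205 and 210 may be sketched in C as follows. This is an illustrative simplification rather than an actual macro-preprocessor: the marker name `esql_buffering_marker` and the functions `source_needs_marker` and `emit_marker_decl` are hypothetical, and the specified statements are assumed to be ROLLBACK and SAVEPOINT.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical declaration emitted when a statement of interest occurs. */
#define MARKER_DECL "extern int esql_buffering_marker;\n"

/* Step 205: examine the source text for a specified statement.
   A plain substring search is used for brevity; a real preprocessor
   would tokenize the input and skip comments and string literals. */
int source_needs_marker(const char *src) {
    return strstr(src, "ROLLBACK") != NULL ||
           strstr(src, "SAVEPOINT") != NULL;
}

/* Step 210: write the declaration into out when it is needed.
   Returns the number of characters in the declaration, or 0 when
   no declaration is emitted. */
int emit_marker_decl(const char *src, char *out, size_t out_size) {
    if (!source_needs_marker(src)) {
        if (out_size > 0) out[0] = '\0';
        return 0;
    }
    return snprintf(out, out_size, "%s", MARKER_DECL);
}
```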

[0046] During step 220 a compiler converts the macro- 
preprocessor output to low-level instructions. Next, during 
step 225 a linker starts to resolve references using a library 
having a first object module that includes a definition for the 
first variable introduced by the macro-preprocessor at step
210. If the linker detects during step 230 that the variable
introduced by the macro-preprocessor has to be resolved
then control transfers to step 235. During step 235 the linker
links the first object module having a definition for the
declared variable introduced by the macro-preprocessor
from the appropriate library. On the other hand, during step 
230 if the linker does not detect the variable introduced by 
the macro-preprocessor then control is transferred to step 
240. During step 240 the linker does not link in the first 
object module since the first variable does not need to be 
resolved. As is readily apparent, the presence or absence of 
the first module in the executable is strictly dependent on the 
presence or absence of the specified statement tested in step 
205. 
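
The decision made in steps 230 through 240 may be modeled in C as shown below. The fragment simulates the decision and does not invoke a real linker; `Module`, `linker_includes` and the symbol names used with it are hypothetical, and each library module is assumed to define a single symbol.

```c
#include <stddef.h>
#include <string.h>

/* One library module, simplified to the single symbol it defines. */
typedef struct {
    const char *name;
    const char *defines;
} Module;

/* Returns 1 if symbol appears in the list of unresolved references. */
static int is_undefined(const char *const *undef, size_t n,
                        const char *symbol) {
    for (size_t i = 0; i < n; i++)
        if (strcmp(undef[i], symbol) == 0) return 1;
    return 0;
}

/* Step 230: does any pending reference need this module?
   Returns 1 for step 235 (link it in), 0 for step 240 (leave it out). */
int linker_includes(const Module *m, const char *const *undef, size_t n) {
    return is_undefined(undef, n, m->defines);
}
```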

[0047] FIG. 3 illustrates an embodiment of the invention 
enabling setting the value of a variable to a non-default value 
in response to detecting a specified statement. Steps 300 and 
305 of FIG. 3 correspond to steps 225 and 230 respectively
of FIG. 2. If the declaration introduced by the macro-preprocessor
does not have a definition, then control passes
from step 305 to step 310. During step 310 the linker 
includes a first object module to resolve the first variable. 
Furthermore, the first object module introduces a non- 
default value for a second variable. 

[0048] If the linker does not encounter a declaration for 
the first variable, i.e., the macro-preprocessor did not detect 
the specified statement, then there is no first variable to 
resolve and control passes to step 315 from step 305. During 
step 315, the linker bypasses the first module because the 
first variable does not have to be resolved. During step 320 
the linker includes a second object module to carry out a 
command, such as a DELETE command, in the source code. 
If the second object module includes a declaration for the 
second variable without a definition, as determined in step 
325, then the control shifts to step 330. 

[0049] The determination of the second variable declara- 
tion during step 325 results from the second variable's 
presence in a symbol table for the second object module and 
the absence of a corresponding memory allocation. During 
step 330 the linker continues to scan the linker libraries in an 
effort to resolve the second variable. During step 335 the
linker includes a third object module to resolve the second
variable. The third object module, which is also the last
module in the linker library, provides a definition setting a
default value, e.g., 0, for the second variable. This value is
in contrast with the non-default value set in the first object
module. The linker does not include the first object module
at this point because it encountered that module before the
unresolved reference to the second variable, identified
during step 325, was introduced.

[0050] It should be noted that while the linker encounters 
the first module prior to the second module, the third object 
module is encountered after the second module. On the other 
hand, during step 325 if the linker does not detect a second 
variable to be resolved, then control passes to step 340. 
During step 340 the linker continues without resolving the 
second variable or including the third object module. 
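The ordering argument of FIG. 3 can be checked with a toy model of a linker making a single pass over a library. The sketch below is an assumption-laden illustration, not a description of any real linker: symbols are reduced to bit flags, the names `SYM_VAR1`, `SYM_VAR2`, `SYM_DELETE`, `library` and `link_once` are invented here, and the only rule modeled is that a module is pulled in when it defines a symbol that is unresolved at the moment the module is encountered.

```c
#include <assert.h>
#include <stddef.h>

/* Symbols modeled as bit flags (hypothetical names). */
enum { SYM_VAR1 = 1, SYM_VAR2 = 2, SYM_DELETE = 4 };

struct module {
    const char *name;
    unsigned defines;  /* symbols the module defines    */
    unsigned needs;    /* symbols the module references */
};

/* Library order matters: the first module comes first, and the module
 * holding the default value comes last. */
static const struct module library[] = {
    { "first: first variable, non-default second variable",
      SYM_VAR1 | SYM_VAR2, 0 },
    { "second: DELETE implementation", SYM_DELETE, SYM_VAR2 },
    { "third: default second variable", SYM_VAR2, 0 },
};

/* One pass over the library. Returns a bit mask in which bit i is set
 * when library[i] was included. */
unsigned link_once(unsigned program_needs)
{
    unsigned defined = 0, undefined = program_needs, included = 0;

    assert((program_needs & ~(unsigned)(SYM_VAR1 | SYM_VAR2 | SYM_DELETE)) == 0);
    for (size_t i = 0; i < sizeof library / sizeof library[0]; i++) {
        if (library[i].defines & undefined) {
            included |= 1u << i;
            defined |= library[i].defines;
            undefined = (undefined | library[i].needs) & ~defined;
        }
    }
    return included;
}
```

With the first variable referenced, `link_once(SYM_VAR1 | SYM_DELETE)` selects the first and second modules and skips the third; with only the DELETE command referenced, `link_once(SYM_DELETE)` skips the first module and selects the second and third, so exactly one definition of the second variable is linked in either case.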

[0051] A software application includes one or more object
modules that often correspond to source code files as is
shown in FIG. 4. A software program 400 has one or more
source files 405 corresponding to the object modules. Some
of the source files include commands in a higher-level 
language, such as function calls or even instructions in 
scripted language. Code corresponding to each of these 
commands is substituted to actually implement the instruc- 
tion. Thus, a macro-preprocessor 415 converts the computer 
program 400 having files 405 in FIG. 4 to yield prepro- 
cessed program code comprising files 410 and possibly 
additional statements 420 in a compilable language. 

[0052] An example of such a system is the SQL language
and its extension in ESQL. SQL ensures that some
standardized tasks can be performed without locking users into
particular implementations. ESQL enables embedding SQL
statements in one or more higher-level languages. A
macro-preprocessor replaces the embedded SQL statements by
implementation-specific code compiled along with the
higher-level language statements. Thus, an exemplary ESQL
processor works by reading C language statements with
interspersed Structured Query Language (SQL) statements.
The SQL statements are converted into Open Database
Connectivity (ODBC) compliant calls. The resulting source
code is compiled and linked. For illustration purposes, FIG.
4 shows the result of compiling the preprocessed program
code 420 by compiler 425 to generate object code. This
object code includes, in the various object modules
contained therein, unresolved references 430, compiled code
435 corresponding to the files 410 and the compiled
macro-preprocessor introduced statements 440.

[0053] This object code is subsequently acted upon by a
linker 455. Linker 455 supplies additional object modules to
resolve unresolved references 430 by providing object code
from libraries 445 and additional object files 450 that are
included by the user. The linker 455 also ensures that the
different object modules have the proper offsets relative to
each other to allow execution of a single executable 460. The
executable 460 is executed in an environment similar to
computing environment 100.

[0054] In an embodiment of the present invention, a first 
statement, such as SAVEPOINT, in a program requires a 
different implementation for a second statement such as 
DELETE. Two possible implementations for the second 
statement independently designate performance characteris- 
tics at runtime. These implementations are provided in 
different object modules corresponding to the same instruc- 
tion or statement. Thus, it is desirable that the code that 
actually implements the additional program instructions 
including the second statement should be sensitive to the 
occurrence or non-occurrence of the first statement. 

[0055] Instead of requiring developers to examine all 
source code files to discover an occurrence of the first 
statement, a macro-preprocessor discovers an occurrence of
the first statement. The macro-preprocessor is designed to
respond to a context defined by the occurrence or non- 
occurrence of one or more statements of interest. Further- 
more, modified linker libraries include object modules for 
resolving variables introduced by the macro-preprocessor. 
Additional modifications to the linker libraries allow object 
modules in the linker libraries to use one or more of the 
global variables representing a context while selecting code 
for execution at runtime. 

[0056] Exemplary embodiments in accordance with the 
invention are described herein below without intending to 
limit the invention to these embodiments. In an embodiment 
of the invention the fact that a variable that is declared but
not defined is set to a default value, e.g., 0, is used to select
code for execution. Thus, if the variable is given a
non-default value upon encountering a statement of interest then
code relevant to the statement of interest is executed.
Selecting code based on the value of the variable results in faster
code, although without reduction in the footprint of the
executable. The following pseudo-code illustrates the use of
such a variable to conditionally execute a code segment:

[0057] IF (_FIRST_STATEMENT_DETECTED == 0) THEN

[0058] {Execute efficient code for implementing second
statement because the first statement is not being
used}

[0059] ELSE

[0060] {Execute the code with the overhead for
implementing the second statement because the first
statement was detected}

[0061] END



[0062] The variable _FIRST_STATEMENT_DETECTED 
is declared and defined in a statement introduced by the 
macro-preprocessor if the macro-preprocessor encounters 
the first statement in any of the program files. 
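As a rough C rendering of the pseudo-code above, the sketch below lets a hypothetical second statement (a row deletion) skip its bookkeeping when the flag is zero. Everything named here is an assumption made for illustration: `first_statement_detected` stands in for _FIRST_STATEMENT_DETECTED (identifiers that begin with an underscore followed by an uppercase letter are reserved in C, so a different spelling is used), and `delete_row`, `undo_log` and `saved_row_count` are invented. The flag is an ordinary global here so the sketch builds as one file, whereas in the described scheme its value would come from the module selected at link time.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for _FIRST_STATEMENT_DETECTED (hypothetical spelling). */
int first_statement_detected = 0;

#define MAX_SAVED 8
static int undo_log[MAX_SAVED];  /* old values kept for a later undo */
static size_t undo_count = 0;

/* Hypothetical second statement: "delete" a row by zeroing it. */
void delete_row(int *rows, size_t index)
{
    assert(rows != NULL);
    if (first_statement_detected == 0) {
        /* Efficient path: the first statement is not being used. */
        rows[index] = 0;
    } else {
        /* Overhead path: remember the old value before deleting it. */
        if (undo_count < MAX_SAVED)
            undo_log[undo_count++] = rows[index];
        rows[index] = 0;
    }
}

/* Number of values saved so far by the overhead path. */
size_t saved_row_count(void)
{
    return undo_count;
}
```

Both branches are compiled into the program, which matches the remark above that this variant gives faster code without reducing the footprint of the executable.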

[0063] In the context of ESQL the implementation of the 
SQL statement, SAVEPOINT, provides an illustration of a 
global variable to flag a particular context. SAVEPOINT 
allows restoration of an earlier state, i.e., undoing a set of 
operations on a database. Therefore, if SAVEPOINT is used 
then the various state defining parameters need to be saved 
as other commands/statements are executed. Upon detecting
SAVEPOINT the ESQL processor injects a declaration into
the C stream of the form:

[0064] extern int _OCC_SAVEPOINT_USED;

[0065] The "extern" keyword informs the compiler that 
storage for _OCC_SAVEPOINT_USED is allocated in 
another file. Therefore, the compiler does not initialize 
_OCC_SAVEPOINT_USED. The linker uses two or more 
libraries such that the first library used by the linker contains 
as its first object module compiled code corresponding to the 
source code: 

[0066] int _OCC_SAVEPOINT_USED = 1;

[0067] int _OCC_SAVEPOINT_ENABLED = 1; 

[0068] and the second library contains in its last 
object module compiled code corresponding to the 
source code: 

[0069] int _OCC_SAVEPOINT_ENABLED = 0; 

[0070] In the linking process, if the macro-preprocessor
injects a declaration for variable _OCC_SAVEPOINT_USED,
the linker includes the first object module to
provide a definition. The first object module provides a
declaration and a value for _OCC_SAVEPOINT_ENABLED
as illustrated. Subsequently, other
object modules in the library include instructions that test
variable _OCC_SAVEPOINT_ENABLED to flag whether
SAVEPOINT has been used in any of the source files. If
_OCC_SAVEPOINT_ENABLED is set, processing for
SAVEPOINT will proceed. If _OCC_SAVEPOINT_ENABLED
is not set, the other object modules will not incur processing
for SAVEPOINT.

[0071] In another embodiment of the invention, any
program object module using the SAVEPOINT feature includes
the declaration:

[0072] int _OCC_SAVEPOINT_NOT_USED;

[0073] A variable declaration assumes that, upon first
encountering the variable, the compiler sets the variable to
zero by default unless the contrary is indicated. However,
this is not a requirement, and multiple declarations in other
object modules are harmless. The linker uses a library
containing an object module having compiled code
corresponding to the following code:

[0074] int _OCC_SAVEPOINT_NOT_USED = 1;

[0075] The macro-preprocessor declares
_OCC_SAVEPOINT_NOT_USED resulting in the linker including the
library object module setting _OCC_SAVEPOINT_NOT_USED
to 1 only if _OCC_SAVEPOINT_NOT_USED is
not declared elsewhere in the main program. (Note carefully
the logical NOT: if the variable is not used in the main
program, it is set to one or TRUE.) Then any other object
module in the library can test _OCC_SAVEPOINT_NOT_USED
to decide if SAVEPOINT is used in any module in
the program.

[0076] In another embodiment of the invention, a program 
object module using the SAVEPOINT feature will include a 
series of declarations having a global scope: 

[0077] int _OCC_SAVEPOINT_USED;

[0078] The first object module in the library contains
compiled code corresponding to

[0079] int _OCC_SAVEPOINT_USED = 1;

[0080] int _OCC_SAVEPOINT_MODULE_1;

[0081] int _OCC_SAVEPOINT_MODULE_2;

[0082] ... and so on up to the number of separate
object modules containing code dedicated for
implementing SAVEPOINT. Each of the separate object
modules for implementing SAVEPOINT contains a
matching definition

[0083] int _OCC_SAVEPOINT_MODULE_1 = 1;

[0084] The first object module is included in the
executable program. In this way, those object modules containing
code for SAVEPOINT can be included in the executable
when required.

[0085] The invention uses well known rules for prepro- 
cessing, compiling and linking computer programs, particu- 
larly programs using the C or C++ programming language to 
improve the development of application programs. It pro- 
vides a method for developing a computer program using a 
non-procedural programming language. The method 
includes declaring at least one first variable in a first source 
file responsive to detecting a first statement conforming to 
the non-procedural programming language. A macro-pre- 
processor examining the source code introduces a declara- 
tion statement for the first variable. A compiler compiles the 
first source file to generate a first object module followed by 
linking using at least one library. The linker includes a 
second object module containing a definition for the first 
variable to resolve the first variable. The second object 
module includes access to code to support implementation 
of the first statement. This access includes references to 
functions that are invoked by other statements to ensure 
proper execution of the first statement. 

[0086] Furthermore, an additional non-procedural pro- 
gramming language is used to provide a third statement 
conforming to the additional non-procedural programming 
language. 

[0087] The flowchart in FIG. 5 describes an embodiment 
of the invention that enables including object modules from 
linker libraries to support additional overhead in the imple- 
mentation of a DELETE statement. The additional overhead 
is required by the execution of one or more additional 
specified statements such as SAVEPOINT. In step 500 a
computer program having the DELETE non-procedural
programming language statement is pre-processed. However,
the additional buffering overhead needs to be incurred only
if the SAVEPOINT command or the ROLLBACK command
is used in the computer program.

[0088] To flag the need for incurring an overhead the 
pre-processor introduces a declaration for a variable _VAR1 
as being an "extern" upon encountering a ROLLBACK or 
SAVEPOINT statement during step 505. Declaring the vari- 
able to be "extern," in a C like programming language, 
informs the compiler that the variable definition is in another 
file external to the file being compiled. Consequently, the 
compiler does not initialize the variable during compilation 
in step 510. 

[0089] Following compilation, the object files generated
by the compiler during step 510 are linked in step 515 using
a linker program that resolves variable references in the
object files. The linker detects if _VAR1 lacks a definition
during step 520. If _VAR1 lacks a definition then the linker
resolves _VAR1 by examining the linker libraries for an
object module having a definition for _VAR1, i.e.,
specifying memory for _VAR1. A value for a variable can be
specified only after memory allocation for storing the
variable value. During step 525 the linker encounters a first
linker library V1 having an object module M1 that provides
a definition setting _VAR1 to 1 along with a definition
setting an additional global variable _VAR2 to a non-default
value of 1. During step 525 inclusion of object module M1
by the linker results in automatically including _VAR2 in a
symbol table for the program being created by the linker.
Other object modules included by the linker can include
instructions to test the value of _VAR2 to detect if object
module M1 has been included. Notably, _VAR2 does not
occur in the computer program, but is used in one or more
of the additional object modules included in the linker
libraries.

[0090] In the absence of an unresolved occurrence of
_VAR1 in the pre-processed code, the linker does not
include object module M1. Instead control is transferred to
step 530. During step 530, the occurrence of _VAR2 in other
object modules included by the linker results in the linker
including an object module M2 to provide a declaration for
_VAR2. The object modules M1 and M2 can be given the
same name but are not in the same linker library, V1. Thus,
the linker includes only one of modules M1 or M2. In other
words, the linker includes object module M1 prior to
encountering object module M2 if a SAVEPOINT or ROLLBACK
command is encountered. Otherwise, object module M2 is
included, thus precluding any need to include object module
M1 (step 525). The executable constructed by this procedure
differs in its size based on whether SAVEPOINT or
ROLLBACK commands are being used.

[0091] FIG. 6 further illustrates an embodiment in
accordance with the invention. In FIG. 6 preprocessed program
code 600 includes files such as file 605 and file 610, with file
605 having some statements 615 introduced by the
preprocessor. Preprocessed code 600 is compiled to obtain object
code 630 having object modules, e.g., object module 620 and
object module 625. The linker links the object modules using
link libraries. Conditional on an unresolved declaration of
_VAR1, the linked version includes object module M1 640,
resulting in a fat version 650. The object module M1
provides support for the extra overhead in response to the
macro-preprocessor declaration 615 in the preprocessed
program code 600. In contrast, the absence of the
macro-preprocessor declaration 615 results in the inclusion of
object module M2 645 and generation of a thin version 655.

[0092] FIG. 7 is a flowchart that tracks the implementa- 
tion of a statement AA in a computer program in accordance 
with the invention. Statement AA has at least two possible 
code implementations that are suited to contexts defined by 
the presence or absence of another statement BB. If
statement BB is present then implementation code A1 is
preferred while the absence of statement BB results in
implementation A2 being preferred. The preprocessing of the
computer program begins at step 700. During step 705 the
macro-preprocessor examines source code to detect the
presence of statement BB. If statement BB is present then
the pre-processor introduces a declaration for a variable
_VAR1 during step 710. On the other hand, a failure to
detect statement BB results in no such declaration being
introduced during step 715. The compiler compiles the
pre-processor output during steps 720 and 725 following
steps 710 and 715 respectively.

[0093] Following compilation in step 725, the linker finds
no occurrence of _VAR1 to resolve because the variable does
not occur anywhere in the computer program. Consequently,
during step 730, which follows step 725, the linker does not
include the first object module M1. Object module M1 has a
definition for _VAR1 and a declaration and definition setting
another variable _VAR2 to a non-default value, where _VAR2
is found in object modules in the linker libraries but not in
the computer program. On the other hand, introduction of
_VAR1 by the macro-preprocessor results in the inclusion of
object module M1 by the linker during step 735, which
follows step 720, since the linker encounters the object
module M1 prior to the object module M2. Note that object
module M2 is the last object module in each linker library
used by the linker and introduces a default value, such as 0,
for _VAR2.

[0094] Following inclusion of object module M1 the linker
does not include object module M2. In contrast, since object
module M2 includes a definition of _VAR2, it is included, to
resolve references to _VAR2 in other object modules, if
object module M1 is not included. Thus, the inclusion of M1
and M2 by the linker is on a mutually exclusive basis. M1
provides access to code supporting added functionality
required by the statement AA if statement BB occurs in the
computer program. M2, on the other hand, has no such
functionality.

[0095] In an embodiment of the invention, an application
uses a global variable to alter its behavior based on the use
or non-use of a feature. Because the value of the global
variable is tested at run-time, the code supporting both
cases is linked into the application as illustrated below:

    if (_FEATURE_XYZ_USED) {  /* Or, if (!_FEATURE_XYZ_NOT_USED) */
        do_thing_the_XYZ_way( );
    } else {                  /* feature XYZ not used */
        do_thing_the_other_way( );
    }

[0096] The application is not linked against either library.
Instead, statically linked wrappers with those function
names are provided. These wrappers explicitly load the
correct library and obtain the address of the requested
function within that library. Thereafter, they merely forward
all calls to the dynamically loaded function.

[0097] In order to produce an executable, the linker must
include both do_thing_the_XYZ_way( ) and
do_thing_the_other_way( ). In order to reduce the runtime footprint the
implementations of do_thing_the_XYZ_way( ) and
do_thing_the_other_way( ) are placed in separately named
dynamically loaded libraries. The program itself makes calls
to do_thing_the_XYZ_way( ) with that entry-point resolved
at link-time from a library containing the wrapper described
immediately above. The first time the wrapper is invoked at
run-time, the wrapper loads the appropriate dynamically
loaded library, finds and stores the entry-point of the
same-name function within that library, and invokes that
entry-point. At subsequent invocations of the wrapper, the stored
entry-point is used immediately with no additional
overhead.

[0098] In an exemplary embodiment of the invention,
given that a stub routine do_thing( ) is statically linked into
the executable, a dynamically-linked version of the code is
as follows:

    int first_call = 0;

    if (!first_call) {
        first_call = 1;
        proper_routine = choose_thing( );
    }
    proper_routine( );

where choose_thing( ) consists of the following code:

    choose_thing( ) {
        if (_FEATURE_XYZ_USED) {
            open_dynamic_library(XYZ);
            return pointer to do_thing_the_XYZ_way( );
        } else {
            open_dynamic_library(not_XYZ);
            return pointer to do_thing_the_other_way( );
        }
    }

[0099] In other words, the first invocation of either
do_thing_the_XYZ_way( ) or do_thing_the_other_way( )
results in the actual call being made to the general
choose_thing( ) function. In turn, choose_thing( ) checks the
_FEATURE_XYZ_USED flag, opens the appropriate
version of the dynamic library and loads the correct version of
the routine. The choose_thing( ) routine returns the correct
version, which is held under some other name, such as
proper_routine( ). Thereafter, calling proper_routine( )
invokes the correct version of the routine from the
correct library.
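The stub and choose_thing( ) listings can be turned into a compilable sketch as follows. This is a simplified illustration under assumptions stated here: the two implementations are ordinary functions in the same file standing in for entry points that would live in separate dynamically loaded libraries, `feature_xyz_used` stands in for _FEATURE_XYZ_USED, and `lookups` and `lookup_count` are invented to make the one-time lookup observable. On a POSIX system the lookup inside `choose_thing` would more naturally use `dlopen` and `dlsym`, which is omitted so the example stays self-contained.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*thing_fn)(void);

/* Stand-in for _FEATURE_XYZ_USED (hypothetical spelling). */
int feature_xyz_used = 0;

/* Stand-ins for entry points that would reside in two separate
 * dynamically loaded libraries. */
int do_thing_the_xyz_way(void)   { return 1; }
int do_thing_the_other_way(void) { return 2; }

static int lookups = 0;  /* how many times a "library" was consulted */

/* Plays the role of choose_thing(): picks the entry point that
 * matches the flag. A real version would open the library here. */
thing_fn choose_thing(void)
{
    lookups++;
    return feature_xyz_used ? do_thing_the_xyz_way
                            : do_thing_the_other_way;
}

/* Statically linked stub: resolves the entry point on the first call,
 * stores it, and forwards every call to the stored entry point. */
int do_thing(void)
{
    static thing_fn proper_routine = NULL;

    if (proper_routine == NULL)
        proper_routine = choose_thing();
    return proper_routine();
}

int lookup_count(void)
{
    return lookups;
}
```

Testing the stored pointer against NULL replaces the separate first_call flag of the listing; the effect is the same, since the lookup runs once and later calls go straight to the stored entry point.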
[0100] In an alternative exemplary embodiment of the
invention, very early in the executable program, the
following code is executed:

    main( ) {
        if (_FEATURE_XYZ_USED) {
            load_dl(do_XYZ_things);
        } else {
            load_dl(do_other_things);
        }
    }

[0101] where load_dl( ) opens and loads a dynamic
library. In this case, both the dynamically-loadable libraries
do_XYZ_things and do_other_things contain the same entry
points, except that in the first case they are written to use
feature XYZ. After the appropriate dynamic library is
loaded, the correct versions of the routines are used on
subsequent calls.
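The early-selection variant can be sketched in C by treating each library as a structure of entry points with identical names. The sketch is hypothetical throughout: `struct entry_points`, `select_library`, `active_library_name` and the two toy implementations are invented for illustration, and selecting a structure stands in for what load_dl( ) would do with a real dynamic library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One "library": a name plus the entry points it exports. Both
 * libraries export the same entry points. */
struct entry_points {
    const char *library_name;
    int (*do_thing)(int);
};

/* Toy implementations; the first is written to use feature XYZ. */
static int do_thing_with_xyz(int x)    { return x + 100; }
static int do_thing_without_xyz(int x) { return x; }

static const struct entry_points do_xyz_things =
    { "do_XYZ_things", do_thing_with_xyz };
static const struct entry_points do_other_things =
    { "do_other_things", do_thing_without_xyz };

static const struct entry_points *active = NULL;

/* Stand-in for the code run very early in main(): choose one
 * library according to the flag. */
void select_library(int feature_xyz_used)
{
    active = feature_xyz_used ? &do_xyz_things : &do_other_things;
}

/* Later calls go through whichever library was selected. */
int do_thing(int x)
{
    assert(active != NULL);
    return active->do_thing(x);
}

const char *active_library_name(void)
{
    assert(active != NULL);
    return active->library_name;
}
```

Because the choice is made once up front, callers need no per-call check of the flag; the cost is that the selection has to run before any entry point is used.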

[0102] In another embodiment of the invention, the
wrapper includes a check on the global variable's state and
reports an error if the application attempts to call the
function in the wrong state. For instance,
do_thing_the_XYZ_way( ) would report an error if it found
_FEATURE_XYZ_USED was false.

[0103] By avoiding the implicit link to both versions of the
do_thing functionality (in separate libraries), the "snapping"
of links to a library whose entry-points are known, from
link-time recognition, not to be used is avoided, with a
reduction in the start-up overhead and memory
consumption.

[0104] "Snapping" the links refers to run-time resolution
of entry-points left unresolved at load-time. Entry-points
flagged at load-time for resolution at run-time are stored in
a special area of the program image file. When the program
image file is loaded into memory at run-time, the entry-point
linkages in this special area are filled with the correct
addresses of the actual function entry-points in the
dynamically loaded library containing the entry-point linkages. This
differs from load-time linking, where the address of the
actual function is known at load-time. The operating
system's program loader, which is responsible for loading a
program image into memory for execution, handles the
"snapping" of links.

[0105] Another embodiment indirectly calls the wrapper
function through a function pointer-table entry. After the
wrapper dynamically loads the appropriate library, it
changes the function pointer-table entry to point directly to
the do_thing function in the loaded library. The wrapper
appears in a statically-linked library (i.e., its address is
resolved at load-time). Following initialization of the
above-described table, an entry "n" contains the address of the
wrapper. If a program built in accordance with this
embodiment wants to perform do_thing( ), then rather than directly
invoking an entry-point (that is resolved at load time or
snapped from some dynamic library at run time), the
program invokes the function pointed to by entry "n" of the
table. The first time this invocation occurs, the entry is
pointing at the wrapper described above. This wrapper
replaces the entry with a pointer to the correct function (i.e.,
the one indicated by the value of _FEATURE_XYZ_USED)
and then calls that function. On subsequent invocations
(invocation via entry "n" in the table), the correct function
is called directly and without invoking the wrapper.
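A minimal C sketch of a self-replacing table entry follows. The names are assumptions made for illustration: `table`, `ENTRY_DO_THING`, `do_thing_wrapper` and `wrapper_call_count` are invented, `feature_xyz_used` stands in for _FEATURE_XYZ_USED, and the two implementations are local functions rather than entry points found in a dynamically loaded library.

```c
#include <assert.h>

typedef int (*thing_fn)(void);

/* Stand-in for _FEATURE_XYZ_USED (hypothetical spelling). */
int feature_xyz_used = 1;

/* Stand-ins for the entry points of the two libraries. */
static int do_thing_the_xyz_way(void)   { return 1; }
static int do_thing_the_other_way(void) { return 2; }

static int wrapper_calls = 0;
static int do_thing_wrapper(void);

enum { ENTRY_DO_THING = 0, TABLE_SIZE };

/* Function pointer table; the entry initially points at the wrapper. */
thing_fn table[TABLE_SIZE] = { do_thing_wrapper };

/* Wrapper: overwrites its own table entry with the correct function
 * and then calls that function. */
static int do_thing_wrapper(void)
{
    wrapper_calls++;
    table[ENTRY_DO_THING] = feature_xyz_used ? do_thing_the_xyz_way
                                             : do_thing_the_other_way;
    return table[ENTRY_DO_THING]();
}

int wrapper_call_count(void)
{
    return wrapper_calls;
}
```

A caller always writes `table[ENTRY_DO_THING]()`; only the first such call passes through the wrapper, and every later call lands directly on the selected function.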

[0106] A "helper" function appears in the statically-linked
library. This helper performs the functions of the
aforementioned wrapper function for every entry in the table. The
helper function checks the appropriate
_FEATURE_xxxx_USED variables, loads the matching library, finds the
matching function name, and places the address of that
entry-point into the appropriate table entry. If a program
built in accordance with this embodiment wants to perform
do_thing( ), then rather than directly invoking an entry-point
(that is resolved at load time or snapped from some dynamic
library at run time), the program invokes the function
pointed to by entry "n" of the table. The first time any
function available through this table is invoked (say, entry
"i"), the entry is pointing at the wrapper described above.
The wrapper-invoked helper function replaces each entry in
the table with a pointer to the correct function. The wrapper
then calls the function via entry "i." On subsequent
invocations (invocation via any entry "j" in the table), the correct
function is called directly without the need to invoke the
wrapper.
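The helper variant differs from a single self-replacing entry in that the first call through any entry fills in the whole table. The C sketch below illustrates this under simplifying assumptions: a single flag `feature_xyz_used` governs every entry (the description allows a separate _FEATURE_xxxx_USED variable per entry), the entry names `ENTRY_OPEN` and `ENTRY_CLOSE` and all function names are invented, and local functions stand in for entry points of dynamically loaded libraries.

```c
#include <assert.h>

typedef int (*entry_fn)(void);

enum { ENTRY_OPEN, ENTRY_CLOSE, ENTRY_COUNT };

/* Stand-in for the _FEATURE_xxxx_USED variables (one flag here). */
int feature_xyz_used = 0;

/* Stand-ins for the entry points of the two libraries. */
static int open_xyz(void)    { return 11; }
static int close_xyz(void)   { return 12; }
static int open_plain(void)  { return 21; }
static int close_plain(void) { return 22; }

static int wrap_open(void);
static int wrap_close(void);

/* Every entry starts out pointing at a wrapper. */
entry_fn table[ENTRY_COUNT] = { wrap_open, wrap_close };

static int helper_runs = 0;

/* Helper: replaces every entry in the table with the correct function. */
static void helper_fill_table(void)
{
    static const entry_fn xyz[ENTRY_COUNT]   = { open_xyz, close_xyz };
    static const entry_fn plain[ENTRY_COUNT] = { open_plain, close_plain };
    const entry_fn *chosen = feature_xyz_used ? xyz : plain;

    for (int i = 0; i < ENTRY_COUNT; i++)
        table[i] = chosen[i];
    helper_runs++;
}

/* Wrappers: run the helper, then call through their own entry. */
static int wrap_open(void)  { helper_fill_table(); return table[ENTRY_OPEN](); }
static int wrap_close(void) { helper_fill_table(); return table[ENTRY_CLOSE](); }

int helper_run_count(void)
{
    return helper_runs;
}
```

Calling through `table[ENTRY_CLOSE]` first and `table[ENTRY_OPEN]` second runs the helper only once, because the first call has already replaced both entries.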

[0107] The preceding description illustrates the selection 
of different object modules by the linker to allow the same 
command to use two different implementations in a manner 
responsive to the program code environment. The invention 
is not limited to the embodiment described above or to the 
proposed implementation of the SAVEPOINT command. 
Other commands of interest can be similarly handled. The 
earlier illustrations of other ways for the macro-preprocessor 
to introduce statements are also easily adapted to result in 
the context-sensitive exclusion or inclusion of a particular 
object module. 

[0108] Moreover, implementations of the invention
include a computer-readable medium having computer
executable instructions for performing the steps of a method
for constructing a computer program. The
computer-readable medium has computer executable
instructions for performing the step of declaring the first variable in
the first source file by insertion of a first variable declaration
statement by a macro-preprocessor responsive to the
detection of the first statement of the non-procedural
programming language.

[0109] The design of suitable linker libraries is modified in
accordance with the invention. FIG. 8 illustrates an
embodiment of linker libraries with object modules and libraries
corresponding to the order in which the libraries are used to
resolve references. Naturally, first the unresolved references
in the source program code are resolved, followed by
references that need to be resolved due to object modules so
included. In FIG. 8 a first linker library V1 800 has a first
object module M1 805. The first object module M1 805 has
a definition for _VAR1 810 and another variable _VAR2
with a non-default value 815. The definition for _VAR2
implicitly introduces a declaration because memory
allocation requires knowledge of the type for the variable. Another
linker library Vm 825 has object modules Ma 830. In
particular, linker library Vm 825 has an object module Mp 835
having a definition 840 setting _VAR2 to a default value.

[0110] Typically, the linker encounters the object module
M1 805 earlier than any other object module, particularly
object module Mp 835. Advantageously, object module Mp
835 is implemented as the last object module in a linker
library to ensure the correct order of processing.

[0111] In view of the many possible embodiments to
which the principles of this invention may be applied, it
should be recognized that the embodiments described herein
with respect to the drawing FIGS. are meant to be illustrative
only and should not be taken as limiting the scope of the
invention. For example, those of skill in the art will
recognize that the elements of the illustrated embodiment shown
in software may be implemented in hardware and vice versa
or that the illustrated embodiment can be modified in
arrangement and detail without departing from the spirit of
the invention. Therefore, the invention as described herein
contemplates all such embodiments as may come within the
scope of the following claims and equivalents thereof.

We claim: 

1. A method of generating an executable from a computer 
program having a plurality of statements in a source code, 
the method comprising the steps of: 

inserting, by a macro-preprocessor, a declaration state- 
ment for a first variable in a first source file in the 
computer program in response to detecting a first 
statement thereby rendering a modified first source file; 

compiling the modified first source file to generate a first 
object module; 

linking the first object module to generate an executable 
program, the linking step including resolving the first 
variable by linking a second object module having a 
definition corresponding to the first variable; and 

providing a third object module in a linker library, the 
third object module corresponding to a second state- 
ment and including code supporting the first statement. 

2. The method of claim 1, wherein the computer program 
includes statements conforming to a non-procedural pro- 
gramming language. 

3. The method of claim 2, wherein the computer program 
is created using at least one statement conforming to an 
embedded structured query language. 

4. The method of claim 1, wherein the first variable 
declaration excludes a definition for the first variable. 

5. The method of claim 1, wherein the code to support the
first statement in the third object module includes instruc- 
tions to carry out additional tasks required by the first 
statement when implementing the second statement. 

6. The method of claim 5, wherein the code to support the 
first statement is accessed in the third object module from 
the second object module. 

7. The method of claim 2, wherein an additional non- 
procedural programming language is used to provide a third 
statement conforming to the additional non-procedural pro- 
gramming language, wherein in response to detecting the 
third statement a second variable is declared in at least one 
source file in the computer program. 

8. The method of claim 1, wherein the macro-preprocessor
does not declare the first variable in the modified first source
file if the macro-preprocessor does not detect the first
statement in the first source file, whereby the first object
module is not linked with the second object module.



9. The method of claim 7, wherein a fourth object module 
having a definition for the second variable is linked instead 
of the second object module. 

10. A computer-readable medium having computer 
executable instructions for performing the steps of a method 
of generating an executable from a computer program 
having a plurality of statements in a source code, the method 
comprising the steps of: 

inserting, by a macro-preprocessor, a declaration statement
for a first variable in a first source file in the
computer program in response to detecting a first 
statement thereby rendering a modified first source file; 

compiling the modified first source file to generate a first 
object module; 

linking the first object module to generate an executable 
program, the linking step including resolving the first 
variable by linking a second object module having a 
definition corresponding to the first variable; and 

providing a third object module in a linker library, the 
third object module corresponding to a second state- 
ment and including code supporting the first statement. 

11. A computer-readable medium as in claim 10, wherein 
the computer program includes statements conforming to a 
non-procedural programming language. 

12. A computer-readable medium as in claim 11 having 
computer executable instructions in the computer program 
conforming to an embedded structured query language. 

13. A computer-readable medium as in claim 10, wherein 
the code to support the first statement in the third object 
module includes instructions to carry out additional tasks 
required by the first statement when implementing the 
second statement. 

14. A computer-readable medium as in claim 13, wherein 
the code to support the first statement is accessed in the third 
object module from the second object module. 

15. A computer-readable medium as in claim 11, wherein 
an additional non-procedural programming language is used 
to provide a third statement conforming to the additional 
non-procedural programming language, wherein responsive 
to detection of the third statement a second variable is 
declared in at least one source file in the computer program. 

16. A computer-readable medium as in claim 10, wherein 
the macro-preprocessor does not declare the first variable in 
the modified first source file if the macro-preprocessor does
not detect the first statement in the first source file whereby 
the first object module is not linked with the second object 
module. 

17. A computer-readable medium as in claim 15, wherein
a fourth object module having a definition for the second 
variable is linked instead of the second object module. 

18. A linker library for generating an executable by a 
linker linking compiled object modules, the linker library 
comprising a first object module having a definition for a 
first variable providing a first value to the first variable and 
a declaration and definition for a second variable providing 
a second value to the second variable whereby a second 
object module can test, at runtime, the second variable for 
the second value to determine if the first object module is 
linked by the linker into the executable. 

19. The linker library of claim 18 wherein the linker 
library has only one object module. 

20. The linker library of claim 18 further having a third
object module after the first object module, the third object 
module having another declaration and definition for the 
second variable providing a third value to the second vari- 
able. 

21. The linker library of claim 20 wherein the third object 
module is the last object module in the linker library. 

22. A linker library comprising a last object module 
having a definition for a variable, the last object module 
following at least one object module in the linker library 
wherein the variable is declared in at least one object module 
distinct from the last object module. 



23. A macro-preprocessor for preprocessing program 
source code, the program source code comprising program- 
ming language statements such that an implementation of a 
first statement requires changing an implementation of a 
second programming language, wherein the macro-preprocessor
introduces a declaration for a first variable in a
preprocessed program source code in response to detecting 
an occurrence of the first programming statement in the 
program source code. 

24. The macro-processor of claim 23 wherein the programming
language is a non-procedural programming language.

***** 
END USER QUERY FACILITY INCLUDING A
QUERY CONNECTIVITY DRIVER

CROSS-REFERENCE TO RELATED
APPLICATIONS

This application is a continuation-in-part application of
U.S. patent application Ser. No. 08/346,507, filed Nov. 29,
1994 (now U.S. Pat. No. 5,70[illegible]66), which is a
continuation-in-part application of U.S. patent application
Ser. No. 08/154,343, filed Nov. 17, 1993 (now abandoned in
favor of continuation application Ser. No. 08/348,742, filed
Nov. 30, 1994, now U.S. Pat. No. 5,487,132, issued Jan. 23,
1996), which in turn is a continuation-in-part of U.S. patent
application Ser. No. 07/846,522, filed Mar. 4, 1992 (now
U.S. Pat. No. 5,325,465, issued Jun. 28, 1994).

TECHNICAL FIELD

This invention pertains to end user query technology, and
more specifically to an end user query facility which scouts
for information by understanding the database model and
guiding the user.

BACKGROUND

This invention pertains to end user query technology,
introducing a novel approach to end user information access.
Current end user query techniques require the user to
understand database models in order to access information.
For example, using prior art database models in which it is
[illegible] separate data base files interrelated, it is
necessary for the database programmer to know in which file
the desired piece of information is located, and then
appropriately connect the files to achieve an orderly access
of the specific file containing the desired information. This
requires a fair amount of skill on the part of the database
programmer and intimate familiarity of that programmer with
the structure of the database, which may be extremely
complex. Furthermore, training new database programmers on
an existing database model requires a considerable amount of
time and effort. One example of a prior art knowledge-based
information retrieval system is the EASY-TALK product
available from [illegible] semantics of the database.

SUMMARY OF THE INVENTION

In accordance with the teachings of this invention, it has
been determined that there would be great usefulness in
providing an end user query technology which is capable of
automatically understanding the database model and guiding
the user to scout for the desired information, thereby
increasing productivity and ease of information access. In
accordance with the teachings of this invention, the user is
freed from the need to understand the database model, with
the end user query facility of this invention quickly guiding
the user to acquire the information. This is made possible by
the end user query facility of this invention first
recapturing the application semantics from the existing
database model to provide a set of derived semantics. The
derived semantics are then used by the end user query
facility to intelligently guide the user to scout for the
desired information in the database. In addition, the derived
semantics can be easily updated by the end user query
facility when the database model is changed.

In accordance with further teachings of this invention, the
user is provided with a "natural" description of the database
model to further ease his effort in information access. The
"natural" description includes classes with an
entity-relationship (E-R) model describing each class. The
classes [illegible] of high-level objects whose [illegible]
contained in the database. The description of the classes is
made possible by the query facility [illegible] additional
application semantics from the existing database model to
provide a richer set of derived semantics. This enriched set
of derived semantics is then [illegible] the classes and
generate their definitions. [illegible] each class is
subsequently translated into an E-R model of the class. In
addition, these classes and their E-R models can easily be
updated by the end-user query facility when the database
model is changed.

In accordance with the teachings of this invention, it has
also been determined that there would be some usefulness in
providing an end-user query technology that allows a user at
a remote site with no on-line access to the database to still
be able to make a query. This is made possible by integrating
the end-user query facility of this invention with an
electronic mail system so that a user at a remote site can
send his query as a mail message and have the result of his
query posted to him also as a mail message. In addition, a
log of query requests and their processing can be kept and
analyzed to track usage and performance of the end-user
query facility.

A key benefit of an additional embodiment of the present
invention [illegible] existing applications used by many
organizations, such as Microsoft Excel spreadsheet, Microsoft
Access DBMS, Lotus 123 spreadsheet, Powersoft PowerViewer
or InfoMaker, Gupta Quest Q+E Database Editor, etc., that
are ODBC compliant or compliant to other data access
interface standard to be reused for making powerful ad-hoc
queries easily through seamless integration of a novel Query
Facility provided by this invention, thus saving on the cost
of purchasing new tools to have more powerful ad-hoc queries
as well as the cost of training users to use the new tools.

BRIEF DESCRIPTION OF THE DRAWING

FIG. 1 is a flow [illegible] embodiment of an end
[illegible];

FIG. 2 is a flow chart depicting one embodiment of
semantics extractor 12 of FIG. 1;

FIG. 3 is a flow chart depicting one embodiment of a
method suitable for use to build basic knowledge threads
step 25 of semantics extractor 12 of FIG. 2;

FIG. 4 is a flow chart depicting one embodiment of a
method suitable for use as information scout 15 of the
embodiment of FIG. 1;

FIG. 5 is a flow chart depicting one embodiment of a
method suitable for use as the search knowledge base step of
the embodiment of FIG. 4;

FIGS. 6a and 6b is a flow chart depicting one method
suitable for use as the infer new threads from knowledge
base step of the embodiment of FIG. 4;

FIG. 7 is a flow chart depicting an alternative embodiment
of this invention including a model purifier in accordance
with this invention;

FIG. 8 is a flow chart depicting one embodiment of a
semantics extractor constructed in accordance with the
teachings of this invention interfacing with a model purifier;
[illegible] of a semantics extractor including a program
analyzer in accordance with the teachings of this invention;

FIG. 10 is a flow chart depicting an alternative embodiment
of this invention including a security model specifier;

FIG. 11 is a flow chart depicting an alternative embodiment
of this invention which includes a model purifier, a
security model specifier, an alternative embodiment of a
semantics extractor, a class generator, and an
entity-relationship (E-R) model translator;

FIG. 12 is a flow chart depicting an alternative embodiment
of a semantics extractor to extract additional semantics
in the form of types of binary relationships and entity types
of files in order for the class generator to generate class
definitions;

FIGS. 13a-1 and 13a-2 form a flow chart depicting one
embodiment of a procedure to identify kernel entities,
suitable for use in the step of identify entity type of each
file in the embodiment of FIG. 12;

FIG. 13b is a flow chart depicting one embodiment of a
procedure to identify characteristic and associative entities,
suitable for use in the step of identify entity type of each file
in the embodiment of FIG. 12;

FIG. 14 is a flow chart depicting one embodiment of a
procedure to reclassify kernel entities into pure lookup
entities, suitable for use in the step of reclassify certain
entities and binary relationships in the embodiment of FIG.
12;

[illegible] of a procedure for Process JD, suitable for use
with the embodiment of FIG. 17; and

FIG. 19 is a flow chart depicting an alternative embodiment
of this invention in which a query facility is integrated
with an E-mail system;

FIG. 20 is a flowchart depicting one embodiment of this
invention which includes a Knowledge Thread Analyzer;

FIG. 21 is a diagram depicting the relationship between
the four components of the ODBC architecture; and

FIG. 22 is a diagram showing a Query Facility linked to
an application using a Query Facility Connectivity Driver, in
accordance with one embodiment of this invention.

DETAILED DESCRIPTION

Overview

The following briefly describes one embodiment of this
invention which is in the reporting language known as
QUIZ. The language is part of a fourth generation language
known as POWERHOUSE from COGNOS Incorporated of
Canada. However, it is to be understood that the teachings
of this invention are equally [illegible] to any [illegible]
technology, including languages other than QUIZ.

[illegible]

a. The Semantics Extractor. Semantics Extractor 12 reads
Database Model [illegible] to explicitly define [illegible]
of the database to the end user query facility.

b. The Information Scout. Information Scout 15 guides
the user to [illegible] the items to be reported in order to
[illegible] the information he wants. This is carried out in
[illegible]. Report Item Selector 16 prompts the
user for keywords that suggest the desired data item, for
example DATE. Using a look-alike keyword search
method, all items in Keyword Library 13 containing
DATE are listed. The user then makes the appropriate
selection. After selecting the items, Inference Engine
17 identifies the files containing the selected items.
Inference Engine 17 then searches for the linkage(s) in
Knowledge Base 14 connecting the identified files.

c. The Program Generator. Program Generator 18
accesses the linkages obtained by Information Scout 15
and generates the corresponding Source Program 19 to
extract the information requested by the User.

d. The Compiler/Executor. The source program is compiled
and executed against the database to generate the
report using Compiler/Executor 20.

A more detailed description of one embodiment of these
modules is now provided.

The Semantics Extractor

FIG. 2 shows a flow chart of Semantics Extractor 12. In
order to extract the semantics of an application, an
application system must have a data dictionary that represents
the [illegible] model. [illegible] parsed by Data
Dictionary Analyzer 21 in order to obtain the keywords and
the information about each file and item in Database Model
[illegible].

The keywords are derived from the item definitions in the
data dictionary in the following manner. The name of each
item may be in the form of one complete word (e.g.
SALARY) or may use more than one word separated with a
hyphen (e.g. DATE-JOINED). Hyphens are removed from
the item names with hyphens by Data Dictionary Analyzer
21. The resulting individual words obtained from the item
names are then stored as keywords in Keyword Library 13,
e.g. SALARY, DATE, and JOINED. These keywords are
later used by Report Item Selector 16 (FIG. 1).

Next, Knowledge Base 14 is built. The first step involves
[illegible] the following information from each file in
Database Model 10:

a. File name
b. For every item in each file
   [illegible] (e.g. numeric, date)
   If item is a key, its key type (e.g. unique key,
   [illegible]

[illegible] Definition List 22 which is then [illegible]
using the step Derive [illegible] with other files. A binary
relationship consists of two files that can be linked. Two
files can be linked if:

(i) both files have one item with the same name, AND 

(ii) the item in the target (second) file is a unique or 
repeating key. 

For example, in a Personnel Information System, the
EMPLOYEES file would have an item called Emp-no which
is a unique key to identify each employee, and the SKILLS
file would also have an item called Emp-no. Each record in
a SKILLS file would contain a particular skill an employee
has. As a single employee would have many different skills,
the Emp-no would be repeated for each skill and so would
be a repeating key. These two files would form a binary
relationship since Emp-no is the common item and Emp-no
in SKILLS file is a repeating key. The item Emp-no in the
source (first) file, which in this example is the EMPLOYEES
file, need not be a key.
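The two-part linkage rule above can be sketched in Python as follows. This is only an illustration under stated assumptions: each file is modelled as a mapping from item name to key type, the key-type labels are invented for the sketch, and the items `Name`, `Skill` and `Text` and the `NOTES` file are hypothetical.

```python
from typing import Dict, List

# Illustrative key-type labels (assumed names, not taken from the source).
UNIQUE, REPEATING, NON_KEY = "unique", "repeating", "non-key"

# A file is modelled as: item name -> key type.
FileDef = Dict[str, str]


def link_items(source: FileDef, target: FileDef) -> List[str]:
    """Return the item names through which source can link to target.

    Rule (i): the item name appears in both files.
    Rule (ii): the item is a unique or repeating key in the target file.
    The item need not be a key in the source file.
    """
    return [
        name
        for name in source
        if name in target and target[name] in (UNIQUE, REPEATING)
    ]


employees: FileDef = {"Emp-no": UNIQUE, "Name": NON_KEY}
skills: FileDef = {"Emp-no": REPEATING, "Skill": NON_KEY}
notes: FileDef = {"Emp-no": NON_KEY, "Text": NON_KEY}  # hypothetical file

print(link_items(employees, skills))  # EMPLOYEES can link to SKILLS
print(link_items(employees, notes))   # Emp-no is not a key in the target
```

Because only the target item must be a key, the check is directional: `link_items(notes, employees)` finds a link even though `link_items(employees, notes)` does not.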

Often two items could be coded differently but mean the
same thing. For example, both P-NO and PART-NO could be
used to represent a part number in an Inventory Control
System. P-NO may occur in some files and PART-NO in
others. These two items are said to be in the same "domain"
called P-NO, as shown below:

File name       Item name    Domain
A-PARTS-FILE    P-NO         P-NO
B-PARTS-FILE    PART-NO      P-NO

In such a case, we would not be able to establish a file
relationship between the two files as the item names are
different. But it would still be meaningful to establish a link
between the two files with such items, as the items are in the
same domain.

In the example above, both A-PARTS-FILE and
B-PARTS-FILE have items in the same domain called P-NO
though the item names are different. A-PARTS-FILE and
B-PARTS-FILE should therefore be linked. The linkage rule
defined earlier is thus modified as follows to take into
account items with not only the same name but with different
names in the same domain:

(i) both files have one item in the same domain, AND

(ii) the item in the target (second) file is a unique or
repeating key.

For every linkage between two files, the item in the source
file can be a unique key, repeating key or non-key, while the
item in the target file must be either a unique or repeating
key. From this restriction we can derive six possible valid
types of file linkages. These are as follows:

   Source file      Target file      Notation
a. Unique key       Unique key       UU
b. Unique key       Repeating key    UR
c. Repeating key    Unique key       RU
d. Repeating key    Repeating key    RR
e. Non-key          Unique key       NU
f. Non-key          Repeating key    NR

However, in one embodiment the repeating to repeating
(RR) combination, which is item d above, is not stored
because this represents a bad file design. A repeating to
repeating relationship indicates a many to many relationship
which preferably should not exist in a normalized data
model. The analyst is informed of such a finding and may
attempt to rectify it.

All the binary relationships found using the above rules
are stored as follows:

a. Source file

b. Source file item to link to target file

c. Target file

d. Target file item to link to source file

e. Relationship (e.g. UU, UR, RU, NU, NR)

From the earlier example using EMPLOYEES and
SKILLS files, the binary relationship is stored as:

Source file   Item     Target file   Item     Relationship
EMPLOYEES     Emp-no   SKILLS        Emp-no   UR

The set of binary relationships derived from the above
step is stored in a Binary Relationship File 24. The next step,
namely Build Basic Knowledge Threads 25, involves deriving
knowledge of Database Model 10 of an application
system from these binary relationships, which is then stored
in Knowledge Base 14 in the form of knowledge threads.
Each thread represents a linkage of many files. The following
is an example of a knowledge thread:

EMPLOYEES -> BRANCHES -> EXPENSES

It contains an EMPLOYEES file linked to BRANCHES
file which is then linked to EXPENSES file. Inference
Engine 17 (FIG. 1) in Information Scout 15 later uses these
threads in isolation or combination to infer the access paths
in order to obtain the information requested by the user. For
example, when the user wants to report the employees and
their expenses in each branch, the above thread is used to
generate the required path to navigate through an application
database in order to acquire the required information.

Knowledge Base 14 is made up of basic and acquired
knowledge threads. The following describes the derivation
of basic knowledge threads by joining binary relationships
from Binary Relationship File 24. The derivation of acquired
knowledge threads is described later.

FIG. 3 is a flow chart of one method suitable for use as
Build Basic Knowledge Threads 25 of FIG. 2. The following
describes how a single basic knowledge thread is built using
the method of FIG. 3, with reference to the following
example of a procedure Find-Thread:



Procedure Find-Thread (file-count, pointer 2)
begin
    old-file-count = file-count
    while (pointer 2 <> end of file)
    begin
        read binary relationship file using pointer 2
            until (thread-end = source file of binary relation
                   record)
            or (end of binary relationship file)
        if (binary relation record exists) and
           (target-file of record not exist in current-thread)
        then begin
            call Procedure Find-Thread (file-count, pointer 2)
            file-count = file-count + 1
            if (max-count < file-count)
                then max-count = file-count
            current-thread = current-thread + target-file
            reset pointer 2 to start of binary relationship
                file
        endif
    endwhile

    if file-count <> old-file-count
        then write current-thread into Basic Knowledge file
end

Note: Parameters file-count and pointer 2 are passed by value. Current-thread
and max-count are global variables.
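The thread-building idea behind Find-Thread can be sketched in Python as shown below. This is not a line-by-line transcription of the procedure: it is a minimal depth-first sketch that assumes a thread is complete once no record can extend it, replaces the pointer handling with ordinary loops, and follows the rule explained in the text after this procedure that a record of type RU never starts a thread. The function name `build_threads` and the sample records are illustrative.

```python
from typing import List, Tuple

# (source file, target file, relationship); link items are omitted here.
Record = Tuple[str, str, str]


def build_threads(records: List[Record]) -> List[List[str]]:
    """Build basic knowledge threads as lists of file names."""
    threads: List[List[str]] = []

    def extend(thread: List[str]) -> None:
        extended = False
        for source, target, _rel in records:
            # A record extends the thread when its source file is the
            # thread-end and its target file is not already on the thread.
            if source == thread[-1] and target not in thread:
                extended = True
                extend(thread + [target])
        if not extended:
            threads.append(thread)  # nothing more to add: thread complete

    for source, target, rel in records:
        if rel == "RU":
            continue  # RU records are not used to start a thread
        extend([source, target])
    return threads


records: List[Record] = [
    ("EMPLOYEES", "BRANCHES", "NU"),
    ("BRANCHES", "EXPENSES", "UR"),
    ("EMPLOYEES", "BILLINGS", "UR"),
    ("BILLINGS", "PROJECTS", "RU"),
]

for thread in build_threads(records):
    print(" -> ".join(thread))
```

Note that the RU record is skipped as a starting point but is still used to extend the EMPLOYEES -> BILLINGS thread to PROJECTS.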

The formation of each thread begins with a first binary
relationship record read from Binary Relationship File 24.
This record forms an initial thread. However, if the first
binary relationship record is of type 'RU' (repeating to
unique) relationship, it is ignored and no initial thread is
formed from it. The next record is then read and if it still is
of type 'RU', it is again ignored until the next record is
found that is not of type 'RU', which is then used to form
the initial thread. The reason for doing this is that it is only
necessary to form threads using either 'RU' or 'UR' types
and not both, as they lead to the same access path being
inferred by Inference Engine 17. In this embodiment we are
using 'UR' types. To keep track of the next record to read
from Binary Relationship File 24, a pointer called pointer 1
is used.

The initial thread can be extended by linking with other
valid binary relationship records. To do so, Binary
Relationship File is searched. This time another pointer, namely
pointer 2, is used to keep track of the next record to read
from Binary Relationship File 24 to link to the thread to be
extended. A link to extend the thread is said to be found if
the first file of the binary relationship record and the file at
the end of the thread, called the thread-end file, are the same;
otherwise the next binary relationship record is read using
pointer 2. If a valid binary relationship record is found, we
then examine if the target (second) file of this record is
already in the thread. If it is not, it is added to the end of the
thread, to become the thread-end. If it is, the next binary
relationship record is read using pointer 2. The search ends
when the end of Binary Relationship File 24 is reached using
pointer 2. The formation of a single thread is then complete
and is stored in Knowledge Base 14 as follows:

Thread-Head:
    file name
    file item to link to next file
Thread-Body:
    file name
    file item to link to the previous file
    relationship to the previous file
    file item to link to the next file
Thread-End:
    file name
    file item to link to the previous file
    relationship to the previous file

The thread-head contains the first file and the first file
item. It is linked to the next file using the thread-body or
thread-end. The thread-body is made up of a file, its item
which has the same domain as the item of the previous file
it is linked to, and the relationship to that file. It also contains
another item which is used to link to the next file. The
thread-end is similar to the thread-body except that it does
not have the item to link to the next file.

The following example briefly shows how Knowledge
Base 14 is built up with basic knowledge threads using the
above procedure. Assume that Binary Relationship File 24
contains the following records:

   Source file   Item      Target file   Item      Relationship
1. EMPLOYEES     Br-no     BRANCHES      Br-no     NU
2. BRANCHES      Br-no     EXPENSES      Br-no     UR
3. EMPLOYEES     Emp-no    BILLINGS      Emp-no    UR
4. BILLINGS      Emp-no    EMPLOYEES     Emp-no    RU
5. PROJECTS      Proj-no   BILLINGS      Proj-no   UR
6. BILLINGS      Proj-no   PROJECTS      Proj-no   RU

The method of this example is then as follows:

Pointer 1 at     Thread formed

1st record       Use the first binary relationship record,
                 EMPLOYEES link to BRANCHES, as the valid
                 binary relationship record to form the initial
                 thread. Next, examine whether this thread can
                 be extended by searching through the same
                 Binary Relationship File, this time using
                 pointer 2. It can be linked to the second binary
                 relationship record, namely BRANCHES link to
                 EXPENSES, to form the final thread:
                 EMPLOYEES -> BRANCHES -> EXPENSES

2nd record       Similarly, using the next binary relationship
                 record, BRANCHES link to EXPENSES, form the
                 initial thread:
                 BRANCHES -> EXPENSES
                 This thread cannot be extended as the EXPENSES
                 file cannot be linked to other binary
                 relationship records.

3rd record       Using the 3rd binary relationship record, the
                 third thread EMPLOYEES -> BILLINGS -> PROJECTS
                 is formed as follows:
                 EMPLOYEES links to BILLINGS via Emp-no
                 BILLINGS links to PROJECTS via Proj-no
                 This thread uses different items, namely Emp-no
                 and Proj-no, to link the two binary relationship
                 records.

4th record       No thread is formed as relationship is RU.

5th record       Thread formed is
                 PROJECTS -> BILLINGS -> EMPLOYEES as follows:
                 PROJECTS links to BILLINGS via Proj-no
                 BILLINGS links to EMPLOYEES via Emp-no
                 This thread uses different items, namely Proj-no
                 and Emp-no, to link the two binary relationship
                 records.

6th record       No thread is formed as relationship is RU.

Each thread is stored in Knowledge Base 14 as follows,
using the third thread above as example:

Thread-Head      file = EMPLOYEES
                 item = Emp-no
Thread-Body      file = BILLINGS
                 item-1 = Emp-no
                 relationship = UR
                 item-2 = Proj-no
Thread-End       file = PROJECTS
                 item = Proj-no
                 relationship = RU
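The stored form of a thread can be illustrated with a short Python sketch that converts a chain of linked binary relationship records into thread-head, thread-body and thread-end entries. The dictionary layout and the `part` key are assumptions made for this illustration; the text specifies only which fields each part holds.

```python
from typing import Dict, List, Tuple

# (source file, source item, target file, target item, relationship)
Record = Tuple[str, str, str, str, str]


def encode_thread(chain: List[Record]) -> List[Dict[str, str]]:
    """Turn a chain of linked records into head, body and end entries."""
    first = chain[0]
    entries = [{"part": "Thread-Head", "file": first[0], "item": first[1]}]
    for i, (_src, _src_item, target, target_item, rel) in enumerate(chain):
        if i + 1 < len(chain):
            # A body entry also carries the item used to reach the next file.
            entries.append({
                "part": "Thread-Body",
                "file": target,
                "item-1": target_item,
                "relationship": rel,
                "item-2": chain[i + 1][1],
            })
        else:
            entries.append({
                "part": "Thread-End",
                "file": target,
                "item": target_item,
                "relationship": rel,
            })
    return entries


# The third thread of the example: EMPLOYEES -> BILLINGS -> PROJECTS.
third_thread = encode_thread([
    ("EMPLOYEES", "Emp-no", "BILLINGS", "Emp-no", "UR"),
    ("BILLINGS", "Proj-no", "PROJECTS", "Proj-no", "RU"),
])
for entry in third_thread:
    print(entry)
```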

FIG. 4 is a flow chart of one method suitable for use as
Information Scout 15 of FIG. 1. Report Item Selector 16
guides the user to select the items to be reported. Inference
Engine 17 then infers the access path based on the items
selected.

Report Item Selector

The user is prompted for a keyword or presented with a
list of all the words in Keyword Library 13 to choose from.
When the user chooses to provide a keyword, Report Item
Selector 16 lists all the items that match the keyword
provided. For example, the keyword DATE could return the
following list of items containing the word DATE:

DATE BIRTH 

DATE JOINED 

DELIVERY DATE 

LAST DATE UPDATE 

The user then selects the desired item from the list. In one 
embodiment, an explanation of each item is also displayed. 
This explanation is either extracted from the data dictionary 
of an application system by Data Dictionary Analyzer 21 or 
entered by a programmer analyst maintaining the application 
system. 
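The keyword lookup performed by Report Item Selector 16 can be sketched as below. Two assumptions are made for the illustration: item names are written with hyphens, following the DATE-JOINED convention used for the data dictionary, and an exact word match stands in for the "look-alike" keyword search, whose details are not given here. The function name `build_keyword_library` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def build_keyword_library(item_names: List[str]) -> Dict[str, List[str]]:
    """Map each word of a hyphenated item name to the items containing it."""
    library: Dict[str, List[str]] = defaultdict(list)
    for name in item_names:
        for word in name.split("-"):
            if name not in library[word]:
                library[word].append(name)
    return dict(library)


items = [
    "SALARY",
    "DATE-BIRTH",
    "DATE-JOINED",
    "DELIVERY-DATE",
    "LAST-DATE-UPDATE",
]
library = build_keyword_library(items)

# All items whose names contain the word DATE.
print(library["DATE"])
```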

Using the above method, the user selects all the items to
be reported. For each item, the file containing it is
automatically identified and added to the file list using
Generate File List Step 27. When two or more items are
selected from the same file, only one entry is made in the file
list. This file list is sorted such that the first file is the file
which has the highest number of items selected and the other
files are in descending order of items selected. When there is
only one file in the list (that is, all the items selected come
from the same file), no search of access paths is required.
Otherwise, Inference Engine 17 is invoked to infer the access
path.
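A minimal Python sketch of the file list generation step follows. It assumes each item belongs to exactly one file and that a lookup table from item to file is available; the table contents and the item names `Name` and `Br-name` are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def generate_file_list(selected: List[str], item_file: Dict[str, str]) -> List[str]:
    """Return one entry per file, most-selected file first."""
    counts = Counter(item_file[item] for item in selected)
    # sorted() is stable, so files with equal counts keep first-seen order.
    return sorted(counts, key=lambda name: -counts[name])


# Hypothetical item-to-file lookup table.
item_file = {
    "Emp-no": "EMPLOYEES",
    "Name": "EMPLOYEES",
    "Br-name": "BRANCHES",
}

file_list = generate_file_list(["Br-name", "Emp-no", "Name"], item_file)
print(file_list)

# An access path has to be inferred only when more than one file is involved.
needs_inference = len(file_list) > 1
print(needs_inference)
```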
Inference Engine 

The first step involves searching for the optimal knowl- 
edge thread in Knowledge Base 14 to be used to generate an 
access path. If no optimal knowledge thread is found, the 
next step is to infer new knowledge threads, one of which is 
used to generate an access path (See FIG. 4). However, if no 
new threads can be inferred, then the user is informed that 
there is no solution. 

Knowledge base 14 comprises two sections: basic and 
acquired knowledge threads. The basic knowledge section 
contains the knowledge threads that are generated by 
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(EMPLOYEES -^BRANCHES) and is equal to the number 
of files in the file list This thread is considered an "optimal 
thread" as all the files in the file list exist contiguously on the 
thread starting from the thread-head. Thread 2 and 3 are 
invalid because BRANCHES is not found on the thread. 

When an optimal thread is found, the search ceases and 
the optimal thread is used to generate the access path. In the 
above example. the thread 

EMPLOYEES ^BRANCHES -^EXPENSES is used to gen- 
erate the access path but only up to the BRANCHES file 
with the EXPENSES file ignored or "trimmed our as it is 
not required in this example of the user query. Otherwise, 
Inference Engine 17 attempts to infer new threads by joining 
two or more threads together in parallel (see FIG. 4). Those 
new threads that are found to be optimal, i.e. they have the 
number of files accessed equal to the number of files in the 
file list from step 27 of FIG. 4. are then classified as acquired 
knowledge threads and stored in the acquired knowledge 
section of Knowledge Base 14. 

The following section describes the process of deriving 
acquired knowledge threads. Before inferring new threads. 
Inference Engine 17 must first generate a list of knowledge 
threads consisting of basic knowledge threads which have at 
least one file matching any file in the file list from step 27 
of FIG. 4. This thread list is then sorted in descending order 
of the number of files in each thread matching those in the 
file list Within this sorted list of knowledge threads, if there 
are more than one thread with the same number of files 
matching those in the file list, the threads are then sorted in 
ascending order of the number of files in each thread. This 
results in a knowledge thread mat has the most number of 
files matching those in the file list bui has the ieast number 
of files on the thread being at the top of the list. This forms 
the sorted thread-list. This list is then used to form new 
threads by joining in parallel two or more threads. FIGS. 6<* 
and 6b form a flow chart of how these new threads are 
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also contains knowledge threads but these are knowledge 
threads that have been inferred by Inference Engine 17. The 
process of acquiring these knowledge threads is explained 
later. 

In this embodiment, the search is made on the acquired
knowledge section first. If a thread is found where all the
files in the file list from step 27 of FIG. 4 exist on the
thread, this thread is used to generate the access path. When
the search is unsuccessful, i.e. no acquired knowledge thread
is found which can match all the files in the file list, the
search proceeds using the basic knowledge section. FIG. 5
shows a flow chart of one method of performing the search
for a knowledge thread from the basic knowledge section to
be used to generate an access path.

The following example illustrates this search. Assume the 
file list from step 27 in FIG. 4 has been built up from the
items the user has selected and it contains two files as
follows:

File list:
EMPLOYEES 
BRANCHES 

Assume that the basic knowledge section has the follow- 
ing three basic knowledge threads with EMPLOYEES as the 
thread-head: 

Thread 1: EMPLOYEES -> BRANCHES -> EXPENSES

Thread 2: EMPLOYEES -> PAY

Thread 3: EMPLOYEES -> BILLINGS

We define the number of files accessed as the number of
files in the thread which exist in the file list, starting from the
thread-head.

Thread 1 has files which match all those in the file list, 
namely EMPLOYEES and BRANCHES. The number of 
files accessed in this case is two 



For ease of explanation of how the above procedure 
works, assume that the file list from step 27 of FIG. 4 
contains the following three files: 
EMPLOYEES 
BRANCHES 
PAY 

and the basic knowledge section of the Knowledge Base 14 
contains the following three threads as before: 

Thread 1: EMPLOYEES -> BRANCHES -> EXPENSES

Thread 2: EMPLOYEES -> PAY

Thread 3: EMPLOYEES -> BILLINGS

Thus there is no basic thread that contains all three files
in the file list. Inference Engine 17 next employs parallel
join inferencing. The basic knowledge threads which have at
least one file found in the file list are extracted and sorted as
follows:



Thread                                  No. of files matching   No. of files
                                        those in file list      in thread

1. EMPLOYEES -> PAY                     2                       2
2. EMPLOYEES -> BRANCHES -> EXPENSES    2                       3
3. EMPLOYEES -> BILLINGS                1                       2
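
The ordering of the sorted thread-list can be sketched as a two-part sort key. The function name `sort_threads` is hypothetical and threads are again modelled as lists of file names; this is an illustration of the ordering rule stated in the text, not the implementation of FIGS. 6a and 6b.

```python
from typing import List

Thread = List[str]


def sort_threads(threads: List[Thread], file_list: List[str]) -> List[Thread]:
    """Keep threads sharing at least one file with the file list, then sort them."""
    wanted = set(file_list)

    def matches(thread: Thread) -> int:
        return sum(1 for name in thread if name in wanted)

    candidates = [t for t in threads if matches(t) > 0]
    # Descending number of matching files, then ascending thread length.
    return sorted(candidates, key=lambda t: (-matches(t), len(t)))


basic_threads = [
    ["EMPLOYEES", "BRANCHES", "EXPENSES"],
    ["EMPLOYEES", "PAY"],
    ["EMPLOYEES", "BILLINGS"],
]
for thread in sort_threads(basic_threads, ["EMPLOYEES", "BRANCHES", "PAY"]):
    print(" -> ".join(thread))
```

Negating the match count in the key gives the descending order on matches while keeping the ascending order on thread length as the tie-breaker.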



The flow chart of FIGS. 6a and 6b is then applied. First,
the thread EMPLOYEES -> PAY is added to the parallel list.
Next, the file EMPLOYEES is read from the file list. Since
EMPLOYEES exists on this thread, it is removed from the
file list. The next file from the file list is then read and
examined to determine whether it exists in the same knowledge
thread. Since it does, it is also removed. However, the next file from
the file list, namely BRANCHES, does not exist in the
knowledge thread. It therefore remains in the file list. The
next step is to retrieve from Knowledge Base 14 the basic
knowledge threads whose thread-head (first file) is
EMPLOYEES.

As stated earlier, the Knowledge Base contains the fol- 
lowing basic knowledge threads: 

1. EMPLOYEES -> BRANCHES -> EXPENSES

2. EMPLOYEES -> PAY

3. EMPLOYEES -> BILLINGS

The basic thread
EMPLOYEES -> BRANCHES -> EXPENSES has the
BRANCHES file and is therefore added to the parallel list.
But before it is added, the EXPENSES file is removed as it
is not a file in the file list. The parallel list now contains the
following threads:

Parallel list:

EMPLOYEES -> PAY (obtained from the sorted thread list)
EMPLOYEES -> BRANCHES (obtained from the basic knowledge threads)

The above knowledge threads have a parallel relationship
through the common file EMPLOYEES and form the parallel
thread as follows:



example, the method is able to infer the following parallel
relationships, whereby EMPLOYEES and PAY are the two
common files:

EMPLOYEES -> PAY -> file1

EMPLOYEES -> PAY -> file2

These then form the parallel thread:

EMPLOYEES -> PAY -> file1
                 -> file2



Program Generator

Based on the path inferred by Inference Engine 17, the
corresponding QUIZ ACCESS statement is generated. In
QUIZ, the file linkage is specified using the ACCESS
statement with the following syntax:

ACCESS file [ LINK item OF file TO item OF file ]
            [ [ { AND  } item OF file TO item OF file ] ... ]
                { LINK }

where ACCESS, LINK, AND, OF, TO are part of the QUIZ syntax,
file refers to the file name,
item refers to the item name in the file to be linked,
[ ] means optional statement,
{ } means choose one of the options, i.e. AND or LINK,
... means repeats one or more times
EMPLOYEES -> BRANCHES
          -> PAY
Next the optimality test is applied. As the number of files
accessed (which is earlier defined as the number of files in
the thread starting from the thread-head) is equal to the
number of files in the file list, an optimal solution has been
found. This optimal thread is then added to Knowledge Base
14 as an acquired knowledge thread. However, if the optimality
test fails, the above process to search for new parallel
relationships is then repeated using the next thread on the
sorted thread-list. If there are any new parallel relationships
found, the optimality test is again applied.
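
The parallel join inference walked through above can be sketched as follows. This is a simplified illustration under stated assumptions, not the flow chart of FIGS. 6a and 6b: the function names are hypothetical, every thread in the sorted list is assumed to contain at least one file from the file list, and trimming is assumed to remove only the trailing files that are not in the file list.

```python
from typing import List, Optional

Thread = List[str]


def trim(thread: Thread, file_list: List[str]) -> Thread:
    """Cut the thread after its last file that the query needs."""
    wanted = set(file_list)
    last = max(i for i, name in enumerate(thread) if name in wanted)
    return thread[: last + 1]


def infer_parallel_thread(
    file_list: List[str], sorted_threads: List[Thread], basic_threads: List[Thread]
) -> Optional[List[Thread]]:
    """Try each sorted thread as a seed; return the first optimal parallel list."""
    for seed in sorted_threads:
        parallel = [trim(seed, file_list)]
        remaining = [f for f in file_list if f not in seed]
        for thread in basic_threads:
            # Only threads sharing the seed's thread-head are joined in parallel.
            if thread[0] != seed[0]:
                continue
            if any(f in thread for f in remaining):
                branch = trim(thread, file_list)
                parallel.append(branch)
                remaining = [f for f in remaining if f not in branch]
        # Optimality test: every file in the file list is now on some branch.
        if not remaining:
            return parallel
    return None


basic = [
    ["EMPLOYEES", "BRANCHES", "EXPENSES"],
    ["EMPLOYEES", "PAY"],
    ["EMPLOYEES", "BILLINGS"],
]
ordered = [basic[1], basic[0], basic[2]]  # the sorted thread-list from the example
print(infer_parallel_thread(["EMPLOYEES", "BRANCHES", "PAY"], ordered, basic))
```

For the example file list this yields the branches EMPLOYEES -> PAY and EMPLOYEES -> BRANCHES, the same parallel list as in the text. A return value of `None` corresponds to the case where the sorted thread-list is exhausted without an optimal thread.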

In the event the sorted thread-list has been exhausted with
no optimal acquired knowledge thread formed from the
parallel relationships found, the user is prompted to select
one of the parallel relationships found if there is more than
one, or else the single parallel relationship found is used to
generate the access path.

In cases where there is more than one parallel
relationship, there may exist parallel relationships as follows:



In QUIZ, there are two ways of defining a linkage
between a number of files: hierarchical and parallel. A
hierarchical linkage is defined with the "LINK ... TO"
option of the ACCESS statement. A parallel linkage is
defined with the "AND ... TO" option of the ACCESS
statement.

When a single thread is used to generate the access path,
the hierarchical link is used. When a combination of two or
more threads is used, a parallel link is used. From the earlier
examples,



EMPLOYEES -> BRANCHES

uses the hierarchical linkage to generate

ACCESS EMPLOYEES LINK Br-no OF EMPLOYEES
TO Br-no OF BRANCHES

EMPLOYEES -> BRANCHES
          -> PAY

uses the parallel linkage to generate

ACCESS EMPLOYEES LINK Br-no OF EMPLOYEES
TO Br-no OF BRANCHES
AND Emp-no OF EMPLOYEES
TO Emp-no OF PAY
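
A minimal sketch of how such an ACCESS statement could be assembled from one or more threads is given below. The function `generate_access` and the `link_items` table (which records the linking item on each side of a file pair) are hypothetical, and the rule of using AND only for the first link of each additional parallel branch is an assumption drawn from the two examples above.

```python
from typing import Dict, List, Tuple

# (source file, target file) -> (item in source file, item in target file)
LinkItems = Dict[Tuple[str, str], Tuple[str, str]]


def generate_access(threads: List[List[str]], link_items: LinkItems) -> str:
    """Build an ACCESS statement; the first thread is hierarchical, later ones parallel."""
    clauses = [f"ACCESS {threads[0][0]}"]
    for index, thread in enumerate(threads):
        for position, (source, target) in enumerate(zip(thread, thread[1:])):
            source_item, target_item = link_items[(source, target)]
            # A new parallel branch starts with AND; every other link uses LINK.
            keyword = "AND" if index > 0 and position == 0 else "LINK"
            clauses.append(
                f"{keyword} {source_item} OF {source} TO {target_item} OF {target}"
            )
    return " ".join(clauses)


links: LinkItems = {
    ("EMPLOYEES", "BRANCHES"): ("Br-no", "Br-no"),
    ("EMPLOYEES", "PAY"): ("Emp-no", "Emp-no"),
}
print(generate_access([["EMPLOYEES", "BRANCHES"]], links))
print(generate_access([["EMPLOYEES", "BRANCHES"], ["EMPLOYEES", "PAY"]], links))
```

Because only this function knows the target syntax, a different language implementation would replace it while leaving the inferred threads unchanged.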



a. EMPLOYEE -> PAY
            -> BRANCHES

b. EMPLOYEE -> BRANCHES
            -> PAY
As both these parallel relationships are semantically the
same, one is redundant and is thus removed.
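
One way to detect such redundancy, offered only as a sketch, is to compare parallel relationships as unordered sets of branches. The function name `remove_redundant` is hypothetical, and treating two relationships as semantically the same exactly when they contain the same branches in any order is an assumption.

```python
from typing import List

Thread = List[str]


def remove_redundant(relationships: List[List[Thread]]) -> List[List[Thread]]:
    """Keep the first of any parallel relationships that differ only in branch order."""
    seen = set()
    unique = []
    for parallel in relationships:
        key = frozenset(tuple(branch) for branch in parallel)
        if key not in seen:
            seen.add(key)
            unique.append(parallel)
    return unique


a = [["EMPLOYEE", "PAY"], ["EMPLOYEE", "BRANCHES"]]
b = [["EMPLOYEE", "BRANCHES"], ["EMPLOYEE", "PAY"]]
print(remove_redundant([a, b]))
```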

There may also be a case whereby there are no parallel
relationships found after the sorted thread list has been
exhausted. In such a case, there is no solution to the end-user

It should be noted that for a different language
implementation, the design of Inference Engine 17 and Knowledge
Base 14 need not change. Only the access path needs to be
rewritten using the designated language.
Compiler/Executor 

After Source Program 19 is generated, Compiler/Executor 
20 is used to compile Source Program 19 into executable 
code. The compiled program is then executed to produce the
report. In this embodiment, Compiler/Executor 20 is the
QUIZ part of POWERHOUSE fourth generation language. 
ALTERNATIVE EMBODIMENTS 

Several additional embodiments are also taught, as will 
now be described. 

In one embodiment, a Model Purifier 26 is included which 
allows a user to add or alter the key type and binary 



using more than one file as common file. For
Knowledge Base 14 accordingly.
In another embodiment, the functionality of Knowledge
Base 14 is extended in order to store items that are derived 
from items in the database model. In this embodiment the 
Model Purifier allows a user to input the specifications for 
the derived items. Derived items can also be obtained from 
source code of applications programs that access the 
database, in which case Semantics Extractor 12 serves to 
extract the derived item specifications from these programs. 
These derived items and their related database items 
together form a pseudo database file. Semantics Extractor 12 
then uses this pseudo file to derive new binary relationships 
with normal database files and build new basic knowledge 
threads. The new binary relationships and new knowledge 
threads are then stored in Knowledge Base 14. In this
embodiment, the functionality of Program Generator 18 is
also extended so that after Inference Engine 17 has generated
the access path, which may contain pseudo database
files as well as normal database files, Program Generator 18
uses this access path to generate source programs to obtain 
information from the normal database files and the pseudo 
database files. 

In another embodiment, another component called the
Security Model Specifier 29 is included to allow a user to 
input a security model which specifies the items of the 
database model that the user can access. This is called item 
security. The security model also specifies the range of 
values within an item that the user can access. This is called 
value security. To support this security, the functionality of 
Knowledge Base 14 is extended to store the security model. 
Functionality of Report Item Selector 16 and Program
Generator 18 are also extended to use the security model in
the Knowledge Base so that the information obtained from 
the database at query time meets the security model speci- 
fication. 

In another embodiment, the functionality of the Model 
Purifier is extended to allow a user to specify multiple 
domains for a data item and the aliases for a file containing 
this item. In this embodiment, the functionality of Program 
Generator 18 is also extended to generate the appropriate file 
alias statements in the source program to access the database 
to satisfy the user query. 

FIG. 7 depicts one embodiment of this invention in which 
Model Purifier 26 is used. Model Purifier 26 serves to allow 
a user to add or alter the key type of items in Knowledge 
Base 14, e.g. the key type of an item can be changed from 
unique to repeating key. Model Purifier 26 is also used to 
allow a user to alter the binary relationships between data- 
base files located within Knowledge Base 14. 

The rationale for the use of Model Purifier 26 in accor- 
dance with this embodiment of the present invention is that 
in some applications the database model or the application 
programs that access the database may not be rich enough 
for Semantics Extractor 12 to extract the necessary seman- 
tics including file linkages for the user to perform certain 
queries. Model Purifier 26 allows the user to input the
additional semantics to satisfy these queries. 

FIG. 8 shows one embodiment of this invention in which 
Semantics Extractor 12 is extended to interface with Model 
Purifier 26. If the key types are altered, Model Purifier 26
activates Semantics Extractor 12 to re-derive the binary
relationships and to rebuild the knowledge threads in
Knowledge Base 14. If the binary relationships are altered,
Model Purifier 26 activates Semantics Extractor 12 to
rebuild the knowledge threads.

A user may specify an item or items to be reported that
may not be found in Database Model 10. Examples of such
items are as follows, which we call derived items as they
are defined using the items of the database files:







      Derived Item            Defined as

(i)   Employee-Name           Firstname of Employee file +
                              Lastname of Employee file
                              Note: Firstname and Lastname are items of
                              database file Employee

(ii)  Sales-Commission        Sales-Amount of Invoice file x
                              Commission Rate of Commission Table file
                              Note: Sales-Amount and Commission Rate are
                              items of database file Invoice and
                              Commission-Table, respectively

(iii) Total-Sales             sum of Sales-Amount of Invoice file for
                              each month of the year

(iv)  Ratio-of-Jan-Sales-     (Sales-Amount of Invoice file for
      to-Total-Sales          January)/(Total-Sales derived from (iii)
                              above)
These derived items can be obtained from direct user
input into Knowledge Base 14 or from Source Code of
Application Programs 27 that access the database. To obtain
such derived items from direct user input, the functionality
of Model Purifier 26 is extended to meet this requirement. To
obtain such derived items from Application Programs 27,
the functionality of Semantics Extractor 12 is extended with
Program Analyzer 28 (see FIG. 9) to extract the derived
items from these programs. In addition to the definitions of
the derived items, the data type, size, and format of these
derived items are also extracted from Application Programs
27 by Program Analyzer 28 of Semantics Extractor 12.
These derived items and their related database items
together form a pseudo database file which is included in
File Definition List 22 (see FIG. 9). Semantics Extractor 12
then uses this pseudo file in File Definition List 22 to derive
new binary relationships with normal database files in File
Definition List 22 and build new basic knowledge threads.
The new binary relationships and new knowledge threads
are stored in Knowledge Base 14. Besides the Model Purifier
26 and Semantics Extractor 12, the Program Generator 18 is
also extended so that after the Inference Engine has generated
the access path, which may contain pseudo database
files as well as normal database files, the Program Generator
18 uses this access path to generate source programs to
obtain information from the normal database files and
pseudo database files.

The following is a description of one embodiment of an 
algorithm suitable for use by Program Analyzer 28 of 
Semantics Extractor 12 to extract the derived items in accor-
dance with the teachings of this invention: 

a. From a single pass application program
A single pass application program is a program that
contains only one database access statement.
Step 1: Extract the database files that are accessed in the
application program.
    For each database file accessed
        Extract the list of items of the database file.
        Store this list and the database file name as a pseudo
        database file in the knowledge base.
    Next database file
Step 2: Extract the list of derived items and their definitions
from the application program.
    For each derived item
        Scan for the data type, size and format. Store the
        derived item name, its definition, data type, size
        and format in the same pseudo database file
        obtained from step 1 above.
    Next derived item

b. From a multiple pass application program
A multiple pass application program consists of many
single pass application programs "stringed" together, with the
output of the previous pass being used as input by the current
pass.
Step 1: Create a new pseudo database file 1 for the first
pass (i=1) using the steps of the single pass application
program above.
Step 2: For each of the remaining passes, i.e. i=2 to n with
n being the last pass:
    Create a pseudo database file i.
    Store the location of the previous pseudo database file
    (i-1) in pseudo database file i.
These derived items are then presented to the user 
together with the items from the database model for the user 
to select from in order to satisfy the user query. 
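
The pseudo database files built by the steps above could be represented as in the following sketch. All names here (`DerivedItem`, `PseudoFile`, `ProgramPass`, `build_pseudo_files`) are hypothetical, and each pass is assumed to have been parsed already into its accessed files and derived items; the parsing of application source code itself is not shown.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DerivedItem:
    name: str
    definition: str
    data_type: str
    size: int
    format: str


@dataclass
class ProgramPass:
    """One pass of an application program, already parsed (assumed input form)."""
    files: Dict[str, List[str]]   # database file name -> list of its items
    derived: List[DerivedItem]    # derived items defined in this pass


@dataclass
class PseudoFile:
    name: str
    database_items: Dict[str, List[str]] = field(default_factory=dict)
    derived_items: List[DerivedItem] = field(default_factory=list)
    previous: Optional[str] = None  # location of pseudo file i-1, if any


def build_pseudo_files(program: str, passes: List[ProgramPass]) -> List[PseudoFile]:
    """Create one pseudo database file per pass, chained to the previous pass."""
    pseudo_files: List[PseudoFile] = []
    for i, program_pass in enumerate(passes, start=1):
        pseudo = PseudoFile(
            name=f"{program}-{i}",
            database_items=dict(program_pass.files),
            derived_items=list(program_pass.derived),
        )
        if i > 1:
            pseudo.previous = pseudo_files[-1].name
        pseudo_files.append(pseudo)
    return pseudo_files


name = DerivedItem("Employee-Name", "Firstname + Lastname", "char", 40, "X(40)")
passes = [
    ProgramPass({"EMPLOYEE": ["Firstname", "Lastname"]}, [name]),
    ProgramPass({"INVOICE": ["Sales-Amount"]}, []),
]
for pseudo in build_pseudo_files("REPORT", passes):
    print(pseudo.name, pseudo.previous)
```

A single pass program is simply the case of one pass, for which `previous` stays empty.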

In order to generate a program from the access path 
inferred by Inference Engine 17 that may contain pseudo 
database files, in addition to the above described extension 
of the functionality of Model Purifier 26 and Semantics 
Extractor 12, the functionality of Program Generator 18 is 
also extended as follows: 

Step 1: Examine the access path generated by the Inference 
Engine 17. 

Step 2: For every file in the access path:
If it is a pseudo database file,
Then 

(i) generate source program to produce data for the 
pseudo database file using the derived item defini- 
tions; and 

(ii) generate source program to extract the information 
from both the database and the data of the pseudo 
database file produced by the program of step 2 (i) 
above. 

Next file 

In another embodiment, a security model specifies the
items a user can access (item security) and the range of 
values within an item the user can access (value security). In 
item security, a user is assigned access rights to a subset of 
the list of items available in Knowledge Base 14. In value 
security, a user is assigned access rights to a range or ranges 
of values within an item in Knowledge Base 14. For 
example, the access rights could be DEPT-NO = 101,
SALARY < 2000, or SALARY of PAY file if EMP-GRADE of
EMPLOYEE file > 8.

FIG. 10 shows the Main Flowchart of one embodiment of
the present invention which supports a security model. As
shown in FIG. 10, Security Model Specifier 29 is provided
for the user to input the security model into Knowledge Base
14. Also, in accordance with this embodiment, the functionality
of Report Item Selector 16 and Generate File List 27
module (FIG. 4) of Information Scout 15, as well as Program 
Generator 18 are extended to support the security model as 
is now explained. 

In one embodiment, the functionality of Report Item 
Selector 16 is extended such that only the items defined in 
the security model that are accessible by the user will be 
presented to the user for selection at query time. 

In one embodiment, the functionality of Generate File
List module 27 is extended to perform the following:
Step 1:

For each query item selected
Retrieve the file(s) containing the query item.
Next query item
Step 2: 

For each file retrieved in step 1 
For each item in the file 

Retrieve any value security defined 
Next item 
Next file 
Step 3: 

For each value security retrieved in steps 1 and 2 



If the security definition involves value from another 
file 

Then add the file to the file list for inferring the access 
path 

Next value security 
In one embodiment of this invention, the functionality of
Program Generator 18 is extended to perform the following:
Step 1:

For each query item selected
Retrieve any value security defined on the item.
Retrieve the file(s) containing the query item.
Next query item
Step 2: 

For each file retrieved in step 1 
For each item in the file 

Retrieve any value security defined 
Next item 
Next file 
Step 3: 

For each value security retrieved in step 2 

Join the value security definition using the AND con- 
dition. 
Next value security 
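
The three steps performed by the extended Program Generator can be sketched as below. The function name and the three lookup tables are hypothetical stand-ins for what would be read from Knowledge Base 14, and the condition strings are illustrative only.

```python
from typing import Dict, List, Tuple


def value_security_condition(
    query_items: List[str],
    item_files: Dict[str, List[str]],           # item -> files containing it
    file_items: Dict[str, List[str]],           # file -> items in the file
    value_security: Dict[Tuple[str, str], str], # (file, item) -> condition text
) -> str:
    """Collect the value security of every touched file and join it with AND."""
    # Step 1: retrieve the files containing each selected query item.
    files: List[str] = []
    for item in query_items:
        for name in item_files.get(item, []):
            if name not in files:
                files.append(name)
    # Step 2: retrieve any value security defined on each item of those files.
    conditions: List[str] = []
    for name in files:
        for item in file_items.get(name, []):
            condition = value_security.get((name, item))
            if condition and condition not in conditions:
                conditions.append(condition)
    # Step 3: join the value security definitions using the AND condition.
    return " AND ".join(conditions)


condition = value_security_condition(
    query_items=["SALARY"],
    item_files={"SALARY": ["PAY"]},
    file_items={"PAY": ["EMP-NO", "DEPT-NO", "SALARY"]},
    value_security={("PAY", "DEPT-NO"): "DEPT-NO = 101", ("PAY", "SALARY"): "SALARY < 2000"},
)
print(condition)
```

The resulting condition would then be attached to the generated source program so that the data returned at query time stays within the security model.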

In yet another embodiment of this invention, the func- 
tionality of the Model Purifier is extended to specify mul- 
tiple domains for a data item. An example of an item with 
multiple domains is Code Id of a database file called Master 
Code which may contain 2 items, namely Code Id and Code 
Description. The data for the Master Code file could, by way 
of example, be as follows: 



Code Id      Code Description

RACE01       Chinese
RACE02       Caucasian
CITY01       San Francisco
CITY02       Singapore



In this example, this file is used to store codes for races 
and cities. However, races and cities are two separate 
domains. To break the codes into the two separate domains, 
the Code Id must be redefined as Race Code and City Code.
The Master Code file will then have the following file 
aliases: 

Race Master Code to contain Race Code Id 
City Master Code to contain City Code Id 
To support these multiple domains for an item, we need to
extend the Model Purifier to allow a user to specify the
multiple domains and the corresponding file aliases. FIGS.
12 and 13 depict one embodiment of a main flowchart and
Semantics Extractor 12 of the present invention which are
capable of supporting multiple domains. The redefined
items, e.g. Race Code Id and City Code Id, are stored in
Keyword Library 13 and the alias file(s) are stored in
Knowledge Base 14. In this embodiment, the functionality
of Program Generator 18 is extended to identify the alias
file(s) in the access path generated by Inference Engine 17
and to generate the necessary alias file(s) statements in the
source program.

There are also a number of alternative embodiments of
this invention related to how the Knowledge Base is physically
implemented. The various forms of implementation
can be considered in the following manner:






5,749,079 




a. Precreated versus Run-time Creation 

b. Permanent Storage versus Temporary Storage 

c. Outside the Database Model versus Inside the Database 
Model 

In precreated mode, the Knowledge Base is created once
and used for every query over a period of time. Should the
Database Model change, then the Knowledge Base is re-created
to properly reflect the changed Database Model.

In run-time mode, the Knowledge Base is created at query 
time for every query, regardless of whether the Database 
Model has or has not changed since a previous query. 

In permanent storage, the Knowledge Base is usually
implemented in secondary storage devices such as disk
drives. In temporary storage, the Knowledge Base is implemented
in memory such as RAM (random access memory).
A Precreated Knowledge Base is preferably (although not 
necessarily) implemented in permanent storage while a 
Run-Time Knowledge Base is preferably (although not 
necessarily) implemented in temporary storage. The reason 
is that as precreated knowledge is created once and reused 
many times, it would be beneficial to keep it permanent so 
that even if the electrical power to the computer system is 
switched off, the Knowledge Base is retained and can be
used again once power is restored. In run-time mode, as the 
Knowledge Base is created at query time and not reused for 
the next query, it is not necessary to implement the Knowl- 
edge Base in permanent storage. 

In POWERHOUSE fourth generation language, the data- 
base model is implemented in the data dictionary. In SQL 
database language, the database model is implemented in the 
system catalog. 

In one embodiment of the present invention, the Knowl- 
edge Base is implemented outside the data dictionary or 
system catalog. In an alternative embodiment of the present 
invention, the Knowledge Base is implemented to reside
partially or fully inside the data dictionary or system catalog, 
as the data dictionary or system catalog is a form of storage 
that can contain the Knowledge Base. 
GENERATION OF CLASSES AND THEIR TRANSLATION
INTO ENTITY-RELATIONSHIP MODELS

An additional embodiment is taught as will now be 
described. The purpose of this embodiment is to provide 
another way for a user to interface with Report Item Selector
16 to formulate a query. Instead of a long list of items for
a user to pick as described earlier, the user is initially
prompted for a class to pick out of a number of classes that 
make up his application. An entity-relationship (E-R) model 
of the selected class is then presented to the user for 
selection of the desired attributes. The desired attributes then 
constitute a designation of the information to be extracted 
from the database. It is also possible for the user to formulate 
the query by selecting the desired attributes from two 
different classes. 

This invention introduces a way of selectively grouping 
the database files into what we call classes and translating
each class definition into an E-R model. These classes 
represent the different types of high-level objects that make 
up the application, e.g. employees, customers, orders, 
invoices, etc. 

This embodiment, as shown in FIG. 11, involves the 
following: 

the extension of Semantics Extractor 12 to extract more 

semantics from Database Model 10 
the addition of Class Generator 30 
the addition of E-R Model Translator 31 
the addition of E-R Model of Classes File 32 
Extension of Semantics Extractor 12 
FIG. 12 is a flowchart of one embodiment of this invention
utilizing Semantics Extractor 12 which is extended to
support the extraction of additional semantics from the
Database Model 10. There are several additional parts to this
embodiment.

Classify Binary Relationships (BR) 

Classify Binary Relationships step 33 classifies each 
binary relationship in Binary Relationships File 24 into one 
of the following types, in this embodiment: 
has_children

has_wards

has_subtype

We shall now describe these three different types of binary
relationships and explain how they can be identified from
the key types of the items of the database files.

Earlier we have described six kinds of file linkages, 
namely 

UR (unique key to repeating key) 

RU (repeating key to unique key) 
NU (non-key to unique key)

UU (unique key to unique key) 

RR (Repeating key to repeating key) 

NR (non-key to repeating key) 
Of these six kinds, both the RR and NR combinations
should not exist in a normalized data model as they represent
a bad file design.

Let us now consider the remaining four kinds of file 
linkages. For the UR kind, there are two possible types of
binary relationships that can exist, namely

has_children

has_wards

Let us explain the meaning of these two types with
examples. Suppose we have an EMPLOYEE file with
Emp_No as a unique key and another file called SKILLS to
contain the skills of every employee. The SKILLS file has a
repeating key item called Emp_No and a non-key item
called Skill but no unique key. The binary relationship
between these two files is as follows:



Source File   Item     Target File   Item     Relationship

EMPLOYEE      Emp_No   SKILLS        Emp_No   UR



This binary relationship is a "has_children" type
because EMPLOYEE is not only related to SKILLS but the
relationship is one where EMPLOYEE considers SKILLS as
its "children". This is because the records of SKILLS can
only be created if their corresponding EMPLOYEE record
exists. This "has_children" type of UR binary relationship
can be identified by the fact that the source file of the binary
relationship has a unique key but the target file has a
repeating key but no unique key.

Let us consider another example involving 3 files as 
follows: 

File        Unique Key                  Repeating Key

CHAPTER     Chap_No
SECTION     Chap_No, Sect_No            Chap_No
PARAGRAPH   Chap_No, Sect_No, Para_No   Chap_No, Sect_No



In this case CHAPTER has a unique key called Chap_No 
and no repeating key. SECTION has a composite unique key 
and a repeating key within the composite unique key. These 
files have the following binary relationships: 
Source File   Item               Target File   Item               Relationship

CHAPTER       Chap_No            SECTION       Chap_No            UR
SECTION       Chap_No, Sect_No   PARAGRAPH     Chap_No, Sect_No   UR



Both these binary relationships are "has_children" type
because records of SECTION can only exist if the corresponding
record of CHAPTER exists. Similarly, records of
PARAGRAPH can only exist if the corresponding record of
SECTION exists. These "has_children" type of UR binary
relationships can be identified by the fact that the source file
and target file have a unique key and the target item is a
repeating key within the unique key of the target file as well
as the binary relationship being UR.

Consider another example where a CUSTOMER file has
Cust_No as its unique key and another file INVOICES has
Inv_No as its unique key and Cust_No as its repeating key.
They both have a binary relationship as follows:

Source File   Item      Target File   Item      Relationship

CUSTOMERS     Cust_No   INVOICES      Cust_No   UR

This binary relationship is a "has_wards" type. In this
case, we cannot consider CUSTOMER as having
INVOICES as its "children" since INVOICES have their
own identity through their own unique key, namely Inv_No.
Instead, we could consider CUSTOMER as
having INVOICES as its "wards" because INVOICES
belong to their respective CUSTOMER.

A "has_wards" type can be identified by the fact that both
the source and target files have a unique key and the target
item is not part of the target file unique key as well as the
binary relationship being UR.

The RU binary relationships are the inverse of the UR
binary relationships. We therefore classify them as
"inverse of has_children" and "inverse of has_wards".
As for the NU binary relationships, we create an
inverse of it and assign to the inverse a "has_wards" type.
We then store it in Knowledge Base 14. For example,
suppose we have an EMPLOYEE file having a NU binary
relationship with the BRANCHES file as follows:

Source File   Item          Target File   Item          Relationship

EMPLOYEE      Branch_Code   BRANCHES      Branch_Code   NU

We create an inverse as follows with the type specified as
"has_wards":

Source File   Item          Target File   Item          Type

BRANCHES      Branch_Code   EMPLOYEE      Branch_Code   has_wards

This binary relationship is then stored in Knowledge Base
14.

Lastly let us consider the UU binary relationship. Such a
binary relationship is called a "has_subtype" type. Consider,
for example, a file called EMPLOYEE with Emp_No as its
unique key and two other files MONTHLY_RATED_EMP
and DAILY_RATED_EMP both of which also have Emp_No
as their unique key. As an example, an employee is either
a monthly rated employee or a daily rated employee but not
both. We consider this as EMPLOYEE having
MONTHLY_RATED_EMP and DAILY_RATED_EMP
as its subtypes. The following binary relationships between
EMPLOYEE and the two other files with EMPLOYEE as
the source file reflect this "has_subtype" type of binary
relationships:

Source File   Item     Target File         Item     Type

EMPLOYEE      Emp_No   MONTHLY_RATED_EMP   Emp_No   UU
EMPLOYEE      Emp_No   DAILY_RATED_EMP     Emp_No   UU

If a file has more than one subtype it is possible to
automatically identify which of the two opposite UU binary
relationships is a "has_subtype" and which the "inverse of
has_subtype" by comparing them with another pair of
UU binary relationships. The ones with the same
source file are the "has_subtype" ones. However, it is not
possible to do so if a file has only one subtype. In a later
section of this description, we explain that a user will
have to input this knowledge using Model Purifier 26. (Note:
In the SQL language using primary keys and foreign keys,
it is possible to automatically identify which of the two
opposite UU binary relationships is a "has_subtype" even if a
file has only one subtype.)

[...] extension of Model Purifier 26 to [...] using the binary
relationships. We [...] the procedure Derive Binary
Relationships 23 can be further extended to identify files that
[...] these file [...]. A file that requires file aliases to be
defined can be identified by the fact that there are more than
one binary [...] RU [...]:

Source File   Item               Target File   Item   Type

EMPLOYEE      Race_Code          MASTER_CODE   Code   NU
EMPLOYEE      Citizenship_Code   MASTER_CODE   Code   NU

The target file, namely MASTER_CODE, and target
item, namely Code, are the same for both binary relationships.
For each such binary relationship, we create a file alias of
the target file and replace the target file of the binary
relationship with the file alias, e.g. for the above two binary
relationships, we create the following file aliases for the
MASTER_CODE file and store them in the Knowledge
Base 14:

EMPLOYEE_Race_Code
EMPLOYEE_Citizenship_Code

and change the binary relationships to:
Source File   Item               Target File                 Item   Type

EMPLOYEE      Race_Code          EMPLOYEE_Race_Code          Code   NU
EMPLOYEE      Citizenship_Code   EMPLOYEE_Citizenship_Code   Code   NU



Identify Entity Type of Each File 

In Identify Entity Type of Each File step 34 a database file 
is classified as one of the following types: 

kernel entity 

characteristic entity 

associative entity 

subtype entity 

pure lookup entity 

Kernel entities are entities that have independent
existence; they are "what the database is really all about". In
other words, kernels are entities that are neither
characteristic nor associative, e.g. suppliers, parts, employees,
orders, etc. are all kernel entities.

A characteristic entity is one whose primary purpose is to
describe or "characterize" some other entity. For example,
the file SKILLS, which contains the skills an employee
has, is a characteristic entity of the EMPLOYEE entity.
Characteristic entities are existence-dependent on the entity
they describe, which can be kernel, characteristic, or
associative.

An associative entity is an entity whose function is to
represent a many-to-many (or many-to-many-to-many, etc.)
relationship among two or more other entities. For example,
a shipment is an association between a supplier and a part.
The entities associated may each be kernel, characteristic, or
associative.

A subtype is a specialization of its supertype. For
example, as described earlier, MONTHLY_RATED_EMP
and DAILY_RATED_EMP are subtypes of EMPLOYEE.

Lastly, we have entities that look like kernel entities but 
should not be classified as one because their purpose is 
solely for lookup of code description. 

FIG. 13a describes the procedure to identify kernel and 
pure lookup entities. It first identifies those database files 
that are either kernel or pure lookup entities using the 
following rule: 

1) Identify those database files that have a unique key. 

2) Of these database files, eliminate those that are used as
a target file in any "has_children" or "has_subtype"
binary relationships.

These database files are either kernel or pure lookup
entities. To distinguish between the two it next uses the
following rule:

1) IF such a database file has no "children" or "subtype",
i.e. it is not a source file in any "has_children" or
"has_subtype" binary relationship; AND

2) IF it is not a "ward", i.e. it is not a target file in any
"has_wards" binary relationship;

3) THEN it is a pure lookup entity;

4) OTHERWISE it is a kernel entity.

FIG. 13b describes one embodiment of a procedure to
identify characteristic and associative entities. It uses the
following rule:

1) IF a database file appears more than once as a target file
in "has_children" binary relationships, THEN it is an
associative entity;

2) OTHERWISE, IF it appears only once, THEN it is a
characteristic entity.

Subtype entities are easily identified as they are the target
files in "has_subtype" binary relationships.
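The classification rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify_entities`, the tuple layout `(source_file, target_file, br_type)`, and the sample data are hypothetical and are not taken from FIGS. 13a and 13b. Where a file could match more than one rule, the sketch lets "subtype" win, which is also an assumption.

```python
from collections import Counter

def classify_entities(unique_key_files, relationships):
    """Assign an entity type to each file.

    unique_key_files: set of file names that have a unique key.
    relationships: list of (source_file, target_file, br_type) tuples.
    """
    def targets(*types):
        return [t for _, t, kind in relationships if kind in types]

    def sources(*types):
        return {s for s, _, kind in relationships if kind in types}

    entity_type = {}

    # Subtype entities: target files of "has_subtype" relationships.
    for name in targets("has_subtype"):
        entity_type[name] = "subtype"

    # Characteristic vs. associative: count appearances as a
    # "has_children" target (once -> characteristic, more -> associative).
    for name, count in Counter(targets("has_children")).items():
        kind = "associative" if count > 1 else "characteristic"
        entity_type.setdefault(name, kind)  # assumption: subtype wins

    # Kernel or pure lookup candidates: files with a unique key that are
    # never a target of "has_children" or "has_subtype".
    excluded = set(targets("has_children", "has_subtype"))
    parents = sources("has_children", "has_subtype")
    wards = set(targets("has_wards"))
    for name in unique_key_files - excluded:
        is_lookup = name not in parents and name not in wards
        entity_type[name] = "pure lookup" if is_lookup else "kernel"
    return entity_type

# Hypothetical sample data for illustration only.
unique_key_files = {"EMPLOYEE", "MANAGER", "BRANCH", "RACE_CODE", "PROJECT"}
relationships = [
    ("EMPLOYEE", "SKILLS", "has_children"),
    ("EMPLOYEE", "BILLINGS", "has_children"),
    ("PROJECT", "BILLINGS", "has_children"),
    ("EMPLOYEE", "MANAGER", "has_subtype"),
    ("BRANCH", "EMPLOYEE", "has_wards"),
    ("BRANCH", "EXPENSES", "has_children"),
    ("RACE_CODE", "EMPLOYEE", "has_wards"),
]
types = classify_entities(unique_key_files, relationships)
```

In this sample RACE_CODE has no children or subtypes and is nobody's ward, so it falls through to the pure lookup case, while BILLINGS appears twice as a "has_children" target and is therefore associative.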



Note that the SYSFILES as used in FIG. 13a is a part of
File Definition List 22, which has been redefined as
consisting of two files, namely

SYSFILES
SYSFILEITEMS

SYSFILES is used to store the following:

file name
indicator whether it is an alias file or a real file having alias files
if alias file, its real file name
entity type

SYSFILEITEMS is used to store the following:

file name
item name
item type (e.g. character, numeric, date)
key type of item, e.g. unique key, repeating key, or non-key

Reclassify Certain Entities and Binary Relationships 35

Even though we have earlier identified some files as
kernel entities, some of these kernel entities should be
reclassified as pure lookup entities. Consider for example a
kernel entity EMPLOYEE having an associative entity
LANGUAGE_SPOKEN whose other kernel entity is LANGUAGE.
LANGUAGE_SPOKEN has two items, namely a
repeating key called Emp_No and another repeating key
called Language_Code. The LANGUAGE file has only two
items, namely a unique key item called Language_Code
and a non-key item called Language_Desc. Even though we
earlier identified LANGUAGE as a kernel entity since it has
LANGUAGE_SPOKEN as its "children", LANGUAGE
should be reclassified as a pure lookup entity since it is used
solely by the LANGUAGE_SPOKEN file to obtain the
description of the Language_Code.

FIG. 14 is a flow chart depicting one embodiment of a
procedure to identify these kernel entities and modify them
to pure lookup entities. It uses the following rule to do this:

1) IF a kernel entity has only associative entities and no
characteristic entities; and

2) IF it is not a target file in any "has_wards" binary
relationships;

3) THEN modify the kernel entity into a pure lookup
entity. Also, modify an associative entity of this
kernel entity into a characteristic entity if the associative
entity has only one other entity that associates it.

Next we access all those "has_children" and "has_wards"
binary relationships whose source file is one of these
pure lookup entities. We then modify them into a new type
called the "inverse_of_pure_lookup" type. We use the word
"inverse" as the lookup direction is not from source file to
target file but from target file to source file.
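The reclassification rule can be illustrated with a short Python sketch. It is not the procedure of FIG. 14, only a reading of the three-part rule above. The function name `reclassify` and the tuple layout are hypothetical, and two assumptions are made: a kernel entity must have at least one associative entity to qualify, and the retyping to "inverse_of_pure_lookup" is applied to every pure lookup entity, not only to the newly reclassified ones. The SKILLS file is added to the LANGUAGE example so that EMPLOYEE itself does not qualify.

```python
def reclassify(entity_type, relationships):
    """Return (entity_type, relationships) after the reclassification step.

    entity_type: dict file name -> entity type string.
    relationships: list of (source_file, target_file, br_type) tuples.
    """
    entity_type = dict(entity_type)
    children, parents = {}, {}
    for s, t, kind in relationships:
        if kind == "has_children":
            children.setdefault(s, []).append(t)
            parents.setdefault(t, []).append(s)
    wards = {t for _, t, kind in relationships if kind == "has_wards"}

    # Rules 1 and 2: kernels with only associative children that are
    # not a "has_wards" target. Decide on a snapshot before mutating.
    to_lookup = [
        name for name, kind in entity_type.items()
        if kind == "kernel"
        and children.get(name)
        and all(entity_type.get(c) == "associative" for c in children[name])
        and name not in wards
    ]

    # Rule 3: reclassify, and demote associative children that are left
    # with exactly one other associating entity.
    for name in to_lookup:
        entity_type[name] = "pure lookup"
        for child in children[name]:
            others = [p for p in parents[child] if p != name]
            if len(others) == 1:
                entity_type[child] = "characteristic"

    # Retype relationships whose source file is a pure lookup entity.
    lookups = {n for n, k in entity_type.items() if k == "pure lookup"}
    retyped = [
        (s, t, "inverse_of_pure_lookup")
        if s in lookups and kind in ("has_children", "has_wards")
        else (s, t, kind)
        for s, t, kind in relationships
    ]
    return entity_type, retyped

# The LANGUAGE example from the text, plus a hypothetical SKILLS file.
before = {
    "EMPLOYEE": "kernel",
    "LANGUAGE": "kernel",
    "LANGUAGE_SPOKEN": "associative",
    "SKILLS": "characteristic",
}
links = [
    ("EMPLOYEE", "LANGUAGE_SPOKEN", "has_children"),
    ("LANGUAGE", "LANGUAGE_SPOKEN", "has_children"),
    ("EMPLOYEE", "SKILLS", "has_children"),
]
after, new_links = reclassify(before, links)
```

Here LANGUAGE becomes a pure lookup entity, LANGUAGE_SPOKEN becomes a characteristic entity of EMPLOYEE, and the LANGUAGE to LANGUAGE_SPOKEN relationship is retyped.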
Class Generator 30

Class Generator 30 (FIG. 11) generates a definition of a
class for each kernel entity in the database and stores this
definition as a Class Definition File (CDF) in Knowledge
Base 14. A class is a cluster of files whose structure is a tree.
The root of the tree is a kernel entity which defines the core
attributes of the class. The tree has the following main
branches:

(i) a branch for each of the subtypes of the root kernel
entity

(ii) a branch for each of the "wards" of the root kernel
entity

(iii) a branch for each of the characteristic entities of the
root kernel entity

(iv) a branch for each of the associative entities of the root
kernel entity

These branches are derived using the "has_subtype",
"has_wards", and "has_children" binary relationships with the
root kernel entity being the source file.

Each of the above subtype, characteristic, and associative
entities could also have their own branches which are their
characteristic or associative entities. The latter characteristic
or associative entities could also have their own
characteristic or associative entities, and so forth. These branches are
derived using the "has_children" binary relationship with
the target files of these binary relationships forming the new
branches. The procedure of FIGS. 15a and 15b together with
the sub-procedure Include_File (list) of FIGS. 16a and 16b
are used to derive the above branches which are then stored
as a set of lists in Class Definition File A (CDF A). An
example of such a list is:

File=EMPLOYEE
item=Emp_No
File=BILLINGS
item=Emp_No
BR type="has_children"
item=Proj_No
File=PROJECTS
item=Proj_No
BR type="inv_of_has_children"

This list contains a file EMPLOYEE, linked to a file
BILLINGS, which is linked to file PROJECTS. The binary
relationship (BR) type from EMPLOYEE to BILLINGS is
"has_children" using the item Emp_No from both files and
the BR type from BILLINGS to PROJECTS is
"inv_of_has_children" using the item Proj_No from both files.

Let us now describe how a list in CDF A is produced using
this procedure. It starts by initializing a list to the first kernel
entity in SYSFILES. This kernel entity forms the root kernel
entity of a class to be generated. Next it looks for a subtype
entity of this kernel entity using a sorted SYSFILES and the
Binary Relationships File 24. This file has been sorted in
descending order of subtypes, kernels, associatives,
characteristics, and pure lookups. If one subtype entity is
found, it is added to the list together with the name of the
corresponding binary relationship type, which in this case is
"has_subtype", and the names of the items used. It then calls
on a sub-procedure Include_File (list), for example as
depicted in FIGS. 16a and 16b, to find associative and
characteristic entities of the subtype entity. If a characteristic
entity is found, it is added to the list together with the name
of the corresponding binary relationship type, which in this
case is "has_children", and the names of the items used. The
sub-procedure Include_File (list) is then called again, this
time to find other associative entities or characteristic
entities of the characteristic entity. If no such entity can be found
the list is then written to the Class Definition File A (CDF A).

Besides these branches, the tree of a class also has what
we call "lookup" branches originating from each node in
the above branches. These "lookup" branches are derived
using the "inv_of_has_wards" and "pure_lookup" binary
relationships, with the node being the source file and the
target file forming new branches. Furthermore, the new
branches could also have their own new "lookup" branches
and so forth. These subsequent "lookup" branches are
formed using not only the "inv_of_has_wards" and
"pure_lookup" binary relationships but also the
"inv_of_has_children" binary relationships with the target file
forming the new "lookup" branches. The procedure of FIG. 17
together with the sub-procedure Process_File of FIGS. 18a
and 18b are suitable for use to derive these branches, which
are then stored as a set of lists in Class Definition File B
(CDF B).

Let us describe one example of how a list in CDF B is
produced using this procedure. It starts by reading in the first
record of CDF A. A list is then initialized to the first file in
this CDF A record. A check is made to see if this file has been
processed before. Since it has not, the sub-procedure
Process_File (list), for example as depicted in FIGS. 18a and
18b, is then called to look for a "lookup" file for this file. If
there is such a file, it is added to the list together with the
corresponding binary relationship type and the
sub-procedure is called again. If no further "lookup" file can be
found, the list is written to CDF B.

Let us now apply the above procedures on the exemplary 
personnel system below to generate the classes for this 
system. 
File          Items             Item Type   Key Type
EMPLOYEE      Emp_No            Numeric     Unique key
              Emp_Name          Character   Non-key
              Branch_No         Numeric     Non-key
              Race_Code         Character   Non-key
              Address           Character   Non-key
              Salary            Numeric     Non-key
MANAGER       Emp_No            Numeric     Unique key
              Co_Car_No         Character   Non-key
NON_MANAGER   Emp_No            Numeric     Unique key
              Union_M_No        Character   Non-key
BRANCH        Branch_No         Numeric     Unique key
              Branch_Name       Character   Non-key
              Br_Tot_Expenses   Numeric     Non-key
              Country_No        Numeric     Non-key
RACE_CODE     Race_Code         Character   Unique key
              Race_Desc         Character   Non-key
SKILLS        Emp_No            Numeric     Repeating key
              Skill             Character   Non-key
PROJECT       Proj_No           Numeric     Unique key
              Proj_Name         Character   Non-key
              Cust_No           Numeric     Non-key
BILLINGS      Emp_No            Numeric     Repeating key
              Proj_No           Numeric     Repeating key
              Month             Date        Non-key
              Amount            Numeric     Non-key
CUSTOMER      Cust_No           Numeric     Unique key
              Cust_Name         Character   Non-key
EXPENSES      Branch_No         Numeric     Repeating key
              Month             Date        Non-key
              Adv_Exp           Numeric     Non-key
              Pers_Exp          Numeric     Non-key
COUNTRY       Country_No        Numeric     Unique key
              Country_Name      Character   Non-key


Using extended Semantics Extractor 12 of FIGS. 11 and 12,
the following binary relationships are derived:

Source File   Item         Target File   Item         Type
EMPLOYEE      Emp_No       SKILLS        Emp_No       has_children
EMPLOYEE      Emp_No       BILLINGS      Emp_No       has_children
EMPLOYEE      Emp_No       MANAGER       Emp_No       has_subtype
EMPLOYEE      Emp_No       NON_MANAGER   Emp_No       has_subtype
BRANCH        Branch_No    EMPLOYEE      Branch_No    has_wards
BRANCH        Branch_No    EXPENSES      Branch_No    has_children
COUNTRY       Country_No   BRANCH        Country_No   inv. of pure_lookup
RACE_CODE     Race_Code    EMPLOYEE      Race_Code    inv. of pure_lookup
PROJECT       Proj_No      BILLINGS      Proj_No      has_children
CUSTOMER      Cust_No      PROJECT       Cust_No      inv. of pure_lookup



Also, the entity type of each file in SYSFILES is as follows: 



File          Entity Type      Short Name of File
EMPLOYEE      Kernel           K1
MANAGER       Subtype          S1
NON_MANAGER   Subtype          S2
BRANCH        Kernel           K2
RACE_CODE     Pure Lookup      L1
SKILLS        Characteristic   C1
PROJECT       Kernel           K3
BILLINGS      Associative      A1
CUSTOMER      Pure Lookup      L2
EXPENSES      Characteristic   C2
COUNTRY       Pure Lookup      L3



Let us first apply the exemplary procedure of FIGS. 15a
and 15b on the personnel system example. It first initializes
a list to the first kernel entity in SYSFILES, which is
EMPLOYEE (K1). This means that it is going to generate
Class Definition File A for the EMPLOYEE class. Next it
uses a sorted SYSFILES and the Binary Relationship File 24
to find other entities to add to this list. The sorted SYSFILES
contains files in descending order of subtypes, kernels,
associatives, characteristics and pure lookups. The next
entity added to the list is S1, which is MANAGER, with the
items being Emp_No and the BR type being
"has_subtype". This list is then written to CDF A.

The list is initialized again to K1 and S2 is added to it, with
the items being Emp_No and the BR type being
"has_subtype", after which it is written to CDF A.

After the subtypes have been processed, the procedure
initializes the list to K1 and searches for those kernel entities
that are "wards" of K1. However, K1 has no "wards" and so
there are no such entities to add to the list.

Next the procedure searches for associative entities of K1.
K1 has one associative entity, namely BILLINGS (A1), so
A1 is added to the list, with the items being Emp_No and
the BR type being "has_children". Next the procedure
includes PROJECT (K3) in this list as it constitutes the
other entity that associates A1, with the items being
Proj_No and the BR type being "inv_of_has_children". This list
containing K1, A1, K3 is then written to CDF A.

After this, the procedure initializes the list to K1 again and
searches for characteristic entities of K1. K1 has one
characteristic entity, namely SKILLS (C1), so C1 is added to the
list with the items being Emp_No and the BR type being
"has_children". The list is then written to CDF A.

At this point CDF A contains the following lists:

a list having K1, S1

a list having K1, S2

a list having K1, A1, K3

a list having K1, C1

The procedure next generates the lists for the next kernel
entity, namely BRANCH (K2). K2 has no subtype and
associative entities but it has EMPLOYEE (K1) as its
"ward". This produces the list K2, K1, with the items being
Branch_No and the BR type being "has_wards". This list
is written to CDF A. It also has EXPENSES (C2) as its
characteristic entity. This produces the list K2, C2, with the
items being Branch_No and the BR type being
"has_children", which is also written to CDF A.

Finally, the list for the third and last kernel entity, namely
PROJECT (K3), is produced. However, there is only one
list, namely

a list having K3, A1, K1

since PROJECT has only an associative entity, which is
BILLINGS (A1), with K1 being the other entity that
associates A1. In this list the item for K3 and A1 is Proj_No with
the BR type being "has_children". The item for A1 and K1
is Emp_No with the BR type being
"inv_of_has_children".
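The seven CDF A lists derived above can be reproduced with a deliberately simplified Python sketch. It is not the procedure of FIGS. 15a to 16b: it builds only one level of branches (no recursion through Include_File), it stores file names only and omits the item names and BR types, and the function name `build_cdf_a` and the data layout are hypothetical.

```python
def build_cdf_a(entity_type, relationships):
    """Build one-level CDF A lists for every kernel entity.

    entity_type: dict file name -> entity type (insertion order is used
                 as the processing order of the kernel entities).
    relationships: list of (source_file, target_file, br_type) tuples.
    """
    lists = []
    kernels = [n for n, k in entity_type.items() if k == "kernel"]
    for root in kernels:
        # Subtype branches first, then "wards" branches.
        for br_type in ("has_subtype", "has_wards"):
            for s, t, kind in relationships:
                if s == root and kind == br_type:
                    lists.append([root, t])
        # Associative branches come before characteristic ones.
        for wanted in ("associative", "characteristic"):
            for s, t, kind in relationships:
                if (s == root and kind == "has_children"
                        and entity_type[t] == wanted):
                    branch = [root, t]
                    if wanted == "associative":
                        # Append the other entities that associate t.
                        branch += [p for p, c, k in relationships
                                   if c == t and k == "has_children"
                                   and p != root]
                    lists.append(branch)
    return lists

entity_type = {
    "EMPLOYEE": "kernel", "MANAGER": "subtype", "NON_MANAGER": "subtype",
    "BRANCH": "kernel", "RACE_CODE": "pure lookup",
    "SKILLS": "characteristic", "PROJECT": "kernel",
    "BILLINGS": "associative", "CUSTOMER": "pure lookup",
    "EXPENSES": "characteristic", "COUNTRY": "pure lookup",
}
relationships = [
    ("EMPLOYEE", "SKILLS", "has_children"),
    ("EMPLOYEE", "BILLINGS", "has_children"),
    ("EMPLOYEE", "MANAGER", "has_subtype"),
    ("EMPLOYEE", "NON_MANAGER", "has_subtype"),
    ("BRANCH", "EMPLOYEE", "has_wards"),
    ("BRANCH", "EXPENSES", "has_children"),
    ("COUNTRY", "BRANCH", "inverse_of_pure_lookup"),
    ("RACE_CODE", "EMPLOYEE", "inverse_of_pure_lookup"),
    ("PROJECT", "BILLINGS", "has_children"),
    ("CUSTOMER", "PROJECT", "inverse_of_pure_lookup"),
]
cdf_a = build_cdf_a(entity_type, relationships)
```

With the personnel data this yields the lists K1,S1; K1,S2; K1,A1,K3; K1,C1; K2,K1; K2,C2; and K3,A1,K1, written with full file names instead of short names.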

Let us next apply the exemplary procedure of FIG. 17 on
the personnel system example. It uses the lists of CDF A
derived earlier to generate the lists for CDF B. Using the
same personnel system, it first initializes a list to the first
entity in the first list of CDF A, namely K1. It then adds to
this list entities that are lookup entities to K1. A lookup
entity is a target file in a "pure_lookup" or
"inv_of_has_wards" binary relationship with K1 as the source file. K1
looks up on BRANCH (K2), so K2 is added to the list with
the items being Branch_No and the BR type being
"inv_of_has_wards". Next a check is made on K2 to see if it too
has lookup entities. K2 in fact has one, namely COUNTRY
(L3). L3 is therefore added to the list with the items being
Country_No and the BR type being "pure_lookup". At this
point the list contains K1, K2, L3. Since L3 has no lookup
entities, this list is written to CDF B. Next the procedure
returns to K2 to see if K2 has other lookup entities. Since it
does not, the procedure returns to K1. It finds that K1 has
another lookup entity, namely RACE_CODE (L1). The list
containing K1, L1, with the items being Race_Code and the
BR type being "pure_lookup", is then written to CDF B.

The next entity in the first list of CDF A is processed next.
This entity is MANAGER (S1). It, however, does not have
any lookup entities and so it is ignored.

As there are no more entities on the first list of CDF A, the
first entity in the second list of CDF A is considered for
processing. A check is made first to see if this entity has
already been processed earlier. This entity is K1, which has
been processed earlier, and so it is ignored. The next entity
is NON_MANAGER (S2), which has not been processed.
However, it does not have any lookup entities and so it is
ignored.

The above procedure is again applied for the next list of
CDF A, which contains K1, A1, K3. As K1 has already been
processed and A1 has no lookup entity, no list is produced
for either of them. However, K3 (PROJECT) has a lookup
entity, namely CUSTOMER (L2). So the list K3, L2 is
produced with the items being Cust_No and the BR type
being "pure_lookup". It is written to CDF B. The next list
of CDF A is K1, C1. However, since K1 has already been
processed and C1 (SKILLS) has no lookup entities, both are
ignored.

The above procedure is applied to the remaining lists in
CDF A. In this example, only one list is produced,
containing K2, L3 with the items being Country_No and the BR
type being "pure_lookup".
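The walk just described amounts to a depth-first traversal of the lookup links, with a record of which starting entities have already been processed. The Python sketch below follows that reading. It is not the procedure of FIGS. 17, 18a and 18b: the names `build_cdf_b`, `FIRST_LEVEL` and `DEEPER`, the explicit list of lookup links, and the omission of item names and BR types from the output are all assumptions made for illustration.

```python
# BR types allowed on the first step from a node, and on later steps.
FIRST_LEVEL = {"pure_lookup", "inv_of_has_wards"}
DEEPER = FIRST_LEVEL | {"inv_of_has_children"}

def build_cdf_b(cdf_a, lookup_links):
    """cdf_a: lists of file names.
    lookup_links: (source_file, target_file, br_type) tuples."""
    cdf_b, processed = [], set()

    def extend(path, allowed):
        nxt = [t for s, t, kind in lookup_links
               if s == path[-1] and kind in allowed and t not in path]
        if not nxt and len(path) > 1:
            cdf_b.append(path)  # no further lookup file: write the list
        for target in nxt:
            extend(path + [target], DEEPER)

    for record in cdf_a:
        for name in record:
            if name not in processed:  # each entity starts a walk once
                processed.add(name)
                extend([name], FIRST_LEVEL)
    return cdf_b

# Personnel example, using the short file names from the text.
cdf_a = [["K1", "S1"], ["K1", "S2"], ["K1", "A1", "K3"], ["K1", "C1"],
         ["K2", "K1"], ["K2", "C2"], ["K3", "A1", "K1"]]
lookup_links = [
    ("K1", "K2", "inv_of_has_wards"),
    ("K2", "L3", "pure_lookup"),
    ("K1", "L1", "pure_lookup"),
    ("K3", "L2", "pure_lookup"),
    ("A1", "K1", "inv_of_has_children"),
    ("A1", "K3", "inv_of_has_children"),
    ("C1", "K1", "inv_of_has_children"),
    ("C2", "K2", "inv_of_has_children"),
]
cdf_b = build_cdf_b(cdf_a, lookup_links)
```

Note that K2 is visited during the walk from K1 but is only marked as processed when it later starts its own walk, which is why the list K2, L3 is still produced.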

Besides CDF A and CDF B, the Class Definition File also
includes CDF C. CDF C includes a single list which contains
all the pure lookup entities.

The CDF C for the personnel system example contains
L1, L2, L3.
E-R Model Translator 

E-R Model Translator 31 (FIG. 11) is used to produce an
Entity-Relationship (E-R) model for each class using CDF
A, CDF B and Binary Relationships File 24. Many different
embodiments of an E-R model can be produced. The
following describes one example of a procedure to produce one
embodiment of an entity-relationship (E-R) model of a class
for all the classes except the class containing the pure lookup
entities. This procedure begins by creating another file
identical to SYSFILEITEMS. This duplicate file is called
TEMPFILEITEMS. Next, for each "inverse_of_pure_lookup"
binary relationship in Binary Relationship File 24,
it inserts all the items of the file used as source file in this
binary relationship, except those items of the file that are
used as source items, into TEMPFILEITEMS at the point
which corresponds to the target items of this binary
relationship. Next, for each "has_children", "has_wards", or
"has_subtype" type of binary relationship in Binary
Relationship File 24, it deletes those items in TEMPFILEITEMS
that are used as target items in such binary relationships.

The procedure then gradually builds the E-R model for
each class making use of the resultant TEMPFILEITEMS. It
first includes all the items of the root kernel entity of the class
in the E-R model of the class. These items are obtained
from TEMPFILEITEMS. Next it applies the following
Decision Tables 1 and 2 on the CDF A and CDF B of the
class to determine the relationship names between two
adjacent entities in the class which are not pure lookup
entities.



DECISION TABLE 1

To define relationship names between two adjacent
entities in the CDF A lists

Rule   From Entity   To Entity   BR Type        Relationship Name
1      K             S           has_subtype    K is a S
2      K             S           has_wards      K has S
3      K             K'          has_wards      K has K'
4      X             C           has_children   X has C
5      X             A           has_children   a. X has Y
                                                b. X and Y have A
                                                (Y is another entity
                                                that associates A)

25 
DECISION TABLE 2

To define relationship names between two adjacent
entities in the CDF B lists

Rule   From Entity   To Entity   BR Type               Relationship Name
1      X             C or A      inv_of_has_wards      X references C or A
2      K'            K           inv_of_has_wards      K' belongs to K
3      C or A        K           inv_of_has_wards      C or A references K
4      C or A        K           inv_of_has_children   C or A belongs to K

Legend

X - kernel, subtype, characteristic, or associative entity
C - characteristic entity
A - associative entity
K, K' - kernel entities
Y - other entity that associates associative entity
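The two decision tables translate naturally into small lookup functions. The Python sketch below is one possible encoding, offered as an illustration only: the function names `cdf_a_names` and `cdf_b_name`, and the single-letter entity kinds "K", "S", "C" and "A" passed as arguments, are hypothetical conventions borrowed from the legend.

```python
def cdf_a_names(frm, to, br_type, to_kind, other=None):
    """Decision Table 1: names for adjacent entities in a CDF A list.

    to_kind is "K", "S", "C" or "A"; other is the entity Y that also
    associates an associative entity (needed for rule 5 only).
    """
    if br_type == "has_subtype":
        return [f"{frm} is a {to}"]                # rule 1
    if br_type == "has_wards":
        return [f"{frm} has {to}"]                 # rules 2 and 3
    if br_type == "has_children" and to_kind == "C":
        return [f"{frm} has {to}"]                 # rule 4
    if br_type == "has_children" and to_kind == "A":
        return [f"{frm} has {other}",              # rule 5a
                f"{frm} and {other} have {to}"]    # rule 5b
    return []

def cdf_b_name(frm, to, br_type, frm_kind, to_kind):
    """Decision Table 2: name for adjacent entities in a CDF B list."""
    if br_type == "inv_of_has_wards":
        if to_kind in ("C", "A"):
            return f"{frm} references {to}"        # rule 1
        if frm_kind == "K" and to_kind == "K":
            return f"{frm} belongs to {to}"        # rule 2
        if frm_kind in ("C", "A") and to_kind == "K":
            return f"{frm} references {to}"        # rule 3
    if (br_type == "inv_of_has_children"
            and frm_kind in ("C", "A") and to_kind == "K"):
        return f"{frm} belongs to {to}"            # rule 4
    return None  # no rule applies, e.g. a pure lookup target

subtype_names = cdf_a_names("EMPLOYEE", "MANAGER", "has_subtype", "S")
billing_names = cdf_a_names("EMPLOYEE", "BILLINGS", "has_children", "A",
                            other="PROJECT")
branch_name = cdf_b_name("EMPLOYEE", "BRANCH", "inv_of_has_wards", "K", "K")
```

Rule 5 returns two names because an associative entity relates the source entity both to the other associating entity and to the association itself.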

For each relationship name identified using the above
tables, the relationship name and the items of the file
corresponding to the second entity of the two adjacent
entities that establish this relationship name are included in
the E-R model of the class. The items included are obtained
from TEMPFILEITEMS. This E-R model is then stored in
the E-R Model of Classes File 32.

Let us now illustrate this procedure by applying it on the
exemplary personnel system described earlier. We know at
this stage that this personnel system has three classes which
are derived from the kernel entities EMPLOYEE,
BRANCH, and PROJECT. Let us call these three classes:

ABOUT EMPLOYEE 

ABOUT BRANCH 

ABOUT PROJECT 

If we were to apply the procedure, we should get the 
following E-R models for the three classes: 



ABOUT EMPLOYEE
    Emp_No
    Emp_Name
    Race_Code
    Race_Desc
    Address
    Salary
    <EMPLOYEE belongs to BRANCH>
        Branch_No
        Branch_Name
        Br_Tot_Expenses
        Country_Code
        Country_Name
    <EMPLOYEE is a MANAGER>
        Co_Car_No
    <EMPLOYEE is a NON_MANAGER>
        Union_M_No
    <EMPLOYEE has PROJECT>
        Proj_Name
        Cust_No
        Cust_Name
    <EMPLOYEE and PROJECT have BILLINGS>
        Month
        Amount
    <EMPLOYEE has SKILLS>
        Skill

ABOUT BRANCH
    Branch_No
    Branch_Name
    Br_Tot_Expenses
    Country_Code
    Country_Name
    <BRANCH has EMPLOYEE>
        Emp_No
        Emp_Name
        Race_Code
        Race_Desc
        Address
        Salary
    <BRANCH has EXPENSES>
        Month
        Adv_Exp
        Pers_Exp

ABOUT PROJECT
    Proj_No
    Proj_Name
    Cust_No
    Cust_Name
    <PROJECT has EMPLOYEE>
        Emp_No
        Emp_Name
        Race_Code
        Race_Desc
        Address
        Salary
    <EMPLOYEE belongs to BRANCH>
        Branch_No
        Branch_Name
        Br_Tot_Expenses
        Country_Code
        Country_Name
    <PROJECT and EMPLOYEE have BILLINGS>
        Month
        Amount



Let us now explain how these E-R models are produced
when the procedure is applied. First another file identical to
SYSFILEITEMS is created. This file is called
TEMPFILEITEMS. For each "inverse_of_pure_lookup" binary
relationship in Binary Relationship File 24, the procedure
inserts in TEMPFILEITEMS, at the point corresponding to
the target item, all the items of the source file except the
source item itself, e.g. there is an "inverse_of_pure_lookup"
binary relationship as follows:

Source File   Item        Target File   Item
RACE_CODE     Race_Code   EMPLOYEE      Race_Code

The items of the source file, namely RACE_CODE, are
Race_Code and Race_Desc. The procedure inserts only the
item Race_Desc (leaving out Race_Code as it is a source
item) at Race_Code of EMPLOYEE in TEMPFILEITEMS,
so that the items of EMPLOYEE in TEMPFILEITEMS
become:

Emp_No
Emp_Name
Branch_No
Race_Code
Race_Desc
Address
Salary

Next the procedure deletes those items in
TEMPFILEITEMS that correspond to the target item in "has_children",
"has_wards" and "has_subtype" binary relationships in
Binary Relationship File 24. For example, Branch_No of
EMPLOYEE in the exemplary personnel system is a target
item in a "has_wards" binary relationship with BRANCH.
Thus, this item in the TEMPFILEITEMS file is deleted.
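The two passes over TEMPFILEITEMS (splice in the lookup items, then delete the target items of the structural relationships) can be sketched as follows. This is a minimal illustration, assuming an in-memory dictionary in place of the file; the function name `build_tempfileitems` and the five-field tuple layout are hypothetical.

```python
import copy

def build_tempfileitems(sysfileitems, relationships):
    """Derive TEMPFILEITEMS from SYSFILEITEMS.

    sysfileitems: dict file name -> ordered list of item names.
    relationships: tuples of
        (source_file, source_item, target_file, target_item, br_type).
    """
    temp = copy.deepcopy(sysfileitems)  # the duplicate file

    # Pass 1: for each inverse_of_pure_lookup relationship, insert the
    # source file's other items right after the target item.
    for sf, si, tf, ti, kind in relationships:
        if kind == "inverse_of_pure_lookup":
            extra = [item for item in sysfileitems[sf] if item != si]
            pos = temp[tf].index(ti) + 1
            temp[tf][pos:pos] = extra

    # Pass 2: delete items used as target items in has_children,
    # has_wards or has_subtype relationships.
    for sf, si, tf, ti, kind in relationships:
        if kind in ("has_children", "has_wards", "has_subtype"):
            if ti in temp[tf]:
                temp[tf].remove(ti)
    return temp

sysfileitems = {
    "EMPLOYEE": ["Emp_No", "Emp_Name", "Branch_No", "Race_Code",
                 "Address", "Salary"],
    "RACE_CODE": ["Race_Code", "Race_Desc"],
    "BRANCH": ["Branch_No", "Branch_Name", "Br_Tot_Expenses",
               "Country_No"],
}
relationships = [
    ("RACE_CODE", "Race_Code", "EMPLOYEE", "Race_Code",
     "inverse_of_pure_lookup"),
    ("BRANCH", "Branch_No", "EMPLOYEE", "Branch_No", "has_wards"),
]
temp = build_tempfileitems(sysfileitems, relationships)
```

For EMPLOYEE the result is Emp_No, Emp_Name, Race_Code, Race_Desc, Address, Salary, which matches the root items of the ABOUT EMPLOYEE class shown earlier.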

The resultant TEMPFILEITEMS file is then used together
with the two decision tables earlier described to produce the
above E-R model for each class, which is then stored in the
E-R Model of Classes File 32.

Let us show how the procedure produces the E-R model
for the ABOUT EMPLOYEE class. First all items of
EMPLOYEE (K1) obtained from TEMPFILEITEMS are
included in the E-R model of this class. Next it reads CDF
B to find records having K1 as the first file. The first such
record is a list containing K1, K2, L3. The next file in this
list is BRANCH (K2), which has a BR type of
"inv_of_has_wards" with EMPLOYEE (K1). As K1 and K2 are kernel
entities with the BR type of "inv_of_has_wards", the
procedure applies rule 2 of Decision Table 2 to derive the
following relationship name:

<EMPLOYEE belongs to BRANCH>

This relationship name together with the items of
BRANCH obtained from TEMPFILEITEMS are then
included in the E-R model of ABOUT EMPLOYEE.

The next file after K2 in the above CDF B record is L3.
As L3 is a pure lookup entity, the procedure ignores it and
proceeds to read in the next record of CDF B having K1 as
the first file. The record is a list containing K1, L1. However,
since the next file in this list, namely RACE_CODE (L1), is
a pure lookup entity, the procedure ignores it.




As there are no more records in CDF B having K1 as the
first file, the procedure starts to read the CDF A file. The first
record of CDF A is a list containing K1, S1. As S1 is a
subtype entity with a BR type of "has_subtype" to K1, the
procedure applies rule 1 of Decision Table 1 to derive the
relationship name:

<EMPLOYEE is a MANAGER>

This relationship name together with the items of
MANAGER obtained from TEMPFILEITEMS are then included
in the E-R model of ABOUT EMPLOYEE class.

Next, the procedure reads the CDF B file to look for
records containing S1 as the first file. However, there are no
such records. It then proceeds to read in the next record of
CDF A. This record contains K1, S2. Using the same rule as
applied to S1 above, the procedure derives the following
relationship name:

<EMPLOYEE is a NON_MANAGER>

This relationship name together with the items of
NON_MANAGER obtained from TEMPFILEITEMS are then
included in the E-R model of ABOUT EMPLOYEE class.

The procedure next reads the CDF B file to look for
records containing S2 as the first file. However, there are no
such records. It proceeds to read in the next record of CDF
A. This record contains K1, A1, K3. As A1 (BILLINGS) is
an associative entity and K3 (PROJECT) is the other entity
that associates A1, the procedure first applies rule 5a of
Decision Table 1 using K1 and K3 to derive the following
relationship name:

<EMPLOYEE has PROJECT>

This relationship name together with the items of
PROJECT obtained from TEMPFILEITEMS are then
included in the E-R model of ABOUT EMPLOYEE class.
Next the procedure reads the CDF B file to look for records with
K3 as the first file. There is one such record, namely K3, L2.
However, as L2 (CUSTOMER) is a pure lookup entity, the
record is ignored. The procedure then applies rule 5b of
Decision Table 1 on the current CDF A record, namely the
list containing K1, A1, K3, to derive the following
relationship name:

<EMPLOYEE and PROJECT have BILLINGS>

This relationship name together with the items of BILLINGS
obtained from TEMPFILEITEMS are then included in the
E-R model of ABOUT EMPLOYEE class.

The procedure next reads the CDF B file to look for
records with A1 as the first file. However, there is no such
record. It then proceeds to read in the next CDF A record. This
record contains K1, C1. Applying rule 4 of Decision Table
1, the procedure derives the following relationship name:

<EMPLOYEE has SKILLS>

This relationship name together with the items of SKILLS
obtained from the TEMPFILEITEMS file are then included
in the E-R model of ABOUT EMPLOYEE class.

The procedure next reads the CDF B file to look for
records having C1 as the first file. However, there are none
and so it proceeds to read the next CDF A record. However,
there are no more CDF A records. This ends the E-R model
translation for the ABOUT EMPLOYEE class from its class
definition.

Besides the procedure just described, the E-R Model
Translator also has another procedure which creates a class
using CDF C to contain all the pure lookup entities. For the
personnel system example, this procedure creates the
following class:

ABOUT PURE LOOKUP ENTITIES
    <RACE_CODE>
        Race_Code
        Race_Desc
    <CUSTOMER>
        Cust_No
        Cust_Name
    <COUNTRY>
        Country_No
        Country_Name
Extension To Existing Modules 

Besides extending Semantics Extractor 12, in one
embodiment Model Purifier 26 and/or Report Item Selector
16 are also extended. The extension of Model Purifier 26
provides an interface to allow a user to specify, for two
opposite binary relationships of UU (unique key to unique
key), which binary relationship is the "has_subtype" and
which is the "inverse of has_subtype" type, if a file has only
one subtype. However, this is not necessary for
embodiments using the SQL language because it is possible to know
this from the primary and foreign keys.

The extension of Report Item Selector 16 provides an
interface that displays all or selected classes and their E-R
models, as well as allowing a user to formulate a query by
picking the desired class attributes from the E-R model of
the classes.

Model Purifier 26 and Security Model Specifier 29 are
optional modules to this embodiment.

In conclusion, this embodiment provides an alternative 
way of presenting the database items for end-users to select 
to formulate their queries. This presentation of using classes 
and their E-R models is more intuitive and meaningful than 
a single list of items. As a result, this embodiment makes it 
easier for end-users to formulate relational database queries. 
INTEGRATION WITH ELECTRONIC MAIL 

In one embodiment of this invention, the end-user query
facility is integrated with an electronic mail (E-mail) system
to allow a user located at a site with no on-line access to the
application database, but with connection to an E-mail
system, to still be able to make a query of the application
database. The E-mail system provides the delivery
mechanism for the user's query to be transmitted to the site where
the query will be processed, and for the query results to be
returned to the user after processing. The integration with
the E-mail system also provides one convenient method for
all queries made via the E-mail system to be logged into a
log file. Information such as user id, items selected by user,
the number of records extracted from database, the time of
query, and the time taken to process the query can all be
logged.

FIG. 19 shows one example of this embodiment. User A 
at a site with no on-line access to the Application Database 
59 interfaces with Report Item Selector 16 of Information 
Scout 15 to select the items to be reported. Note that in this 
embodiment, the Information Scout consists only of the 
Report Item Selector which interfaces with the Send-Mail 
Agent instead of the Inference Engine 17. The Send-Mail 
Agent 41 then takes the items selected and stores them into 
a file which is then attached to an E-mail. This E-mail is then
sent to the Query Mailbox 40. The Open-Mail Agent 42 
periodically checks for E-mail in the Query Mailbox 40. If 
there is an E-mail, it reads the E-mail to get the file attached. 
This file contains the items selected by the User A. It then 
passes the items selected to the Inference Engine. Note that 
in this embodiment the Inference Engine 17 interfaces with 
the Open-Mail Agent 42 instead of the Report Item Selector 
16. The Inference Engine then identifies one or more data- 
base files which contain the desired information and 
searches the Knowledge Base 14 to determine the linkages 
connecting the identified files. Program Generator 18 then 
generates a Source Program 19 based on the linkages 
inferred. Source Program 19 is then compiled and executed 
against the Application Database by the Compiler/Executor 
20. The query result obtained is then composed into an 
E-mail by the Open-Mail Agent 42 and sent to User A
Mailbox 43. User A obtains his query result by reading this 
E-mail. 



The following describes in greater detail the Send-Mail 
Agent 41 and the Open-Mail Agent 42 based on the inte- 
gration with Microsoft Mail, a product of Microsoft 
Corporation, USA. 

The procedure for the Send-Mail Agent 41 is given as a 
representative section of code in Program Listing 1. It is 
written in Microsoft's Visual Basic under Microsoft's Win- 
dows operating system and uses the Microsoft's Electronic 
Forms Designer as well as the Microsoft's Messaging Appli- 
cation Program Interface (MAPI) to interface with the 
Microsoft Mail E-Mail System.

The procedure works as follows. After User A has selected
his items using the Report Item Selector 39 which has been
implemented using Microsoft's Electronic Forms Designer,
the Send-Mail Agent 41 is invoked. This Agent first stores
the items selected into a temporary file called query.dat
which is then assigned to another file called FName. Next a
function called WriteMessage is called to attach this file to
an E-Mail created by Report Item Selector 39. The WriteMessage
function is one of the functions provided by
Microsoft's Electronic Forms Designer. This function uses
the command MEFAddAttachment( ) to attach the FName
file to the E-Mail. This E-Mail is then posted to the Query
Mailbox 40 by using the command MAPISendMail. This
completes the description of Send-Mail Agent 41.
The procedure for Open-Mail Agent 42 is given as a
representative section of code in Program Listing 2. It is
written in C language under Microsoft's Windows operating
system and also uses Microsoft Mail's Messaging Application
Program Interface (MAPI) to interface with the Microsoft
Mail E-Mail system.

This procedure works as follows. It periodically checks
Query Mailbox 40 of the E-Mail system using the subroutine
vTimerServeProc( ) running under subroutine WinProc( ).
vTimerServeProc( ) in turn uses the MAPI command
MAPIFindNext to find out whether there is any E-Mail in
the mailbox. If there is, it then issues the MAPI command
MAPIReadMail to read the E-Mail. At the same time it logs
the time the message is read into the server log file (which
is one of the two files under Log File 44, the other being the
queue log file) by calling the sub-routine vServerLog. It next
processes the attachment in the E-Mail. The attachment is a
file where the items selected by a user as his query are stored.
This file is called query.dat. vTimerServeProc next calls
vServeAttach to process this attachment. vServeAttach first
logs the user name, the subject and the date received into the
server log file. It then calls Inference Engine 17 which is
implemented as a sub-routine called lAppFHEng. This sub-
routine also contains Program Generator 18 to generate the
PowerHouse Quiz source program based on the linkages
determined by Inference Engine 17. It next uses Compiler/
Executor 20 to get the query result from the database.
Compiler/Executor 20 is implemented as a sub-routine
called lAppPHRet. The query result is stored in a file called
RESULT.XXX where XXX can be XLS, MDB, TXT or DBF,
which are the file extensions for the different file formats
available. Next, vServeAttach logs the number of rows of
records retrieved from the database that make up the query
result. It also logs the selection filter used in the user query
as well as the items selected into the queue log file of Log
File 44 using the sub-routine vQueueLog. Lastly, it composes
an E-Mail to contain the query result and then uses the
MAPISendMail command to send this E-Mail to User A
Mailbox 43. This completes the description of the Open-
Mail Agent 42.
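
The polling flow described above can be illustrated with a minimal Python sketch. It is not Program Listing 2 and it does not call MAPI; the mailbox is an in-memory queue, and the names Mail, OpenMailAgent, poll_once and run_query are hypothetical stand-ins chosen for this illustration.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Mail:
    sender: str
    subject: str
    attachment: list  # items selected by the user (the contents of query.dat)


@dataclass
class OpenMailAgent:
    query_mailbox: deque            # stands in for Query Mailbox 40
    run_query: Callable             # stands in for Inference Engine 17 through Compiler/Executor 20
    server_log: list = field(default_factory=list)
    queue_log: list = field(default_factory=list)
    outbox: list = field(default_factory=list)   # stands in for User A Mailbox 43

    def poll_once(self):
        """Serve every mail currently waiting and return how many were served."""
        served = 0
        while self.query_mailbox:
            mail = self.query_mailbox.popleft()
            # Server log: who sent the query and under which subject.
            self.server_log.append((mail.sender, mail.subject))
            rows = self.run_query(mail.attachment)
            # Queue log: the items selected and the number of rows retrieved.
            self.queue_log.append((mail.sender, tuple(mail.attachment), len(rows)))
            # Reply to the sender with the query result.
            self.outbox.append((mail.sender, rows))
            served += 1
        return served


mailbox = deque([Mail("user_a", "query", ["Emp_Name", "Salary"])])
agent = OpenMailAgent(mailbox, lambda items: [("John", 1000), ("Sally", 2000)])
served = agent.poll_once()
```

In a real deployment poll_once would be driven by a timer, as vTimerServeProc( ) is in the description above.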

Thus by integrating the query facility to an E-Mail 
system, a user who has no on-line access to his application 
database is still able to make his query using the E-Mail 
system. Another advantage of this embodiment is that all
queries made can be logged for historical analysis.
Furthermore, the embodiment allows queries to be processed
in batch mode during off-peak hours whenever there
is excessive load on the database system during peak hours.
ALTERNATIVE WAY OF PRESENTING QUERY 
RESULTS TO HELP PREVENT MISINTERPRETATION 

This alternative embodiment provides an alternative way 
in which query results can be presented to help prevent users 
from misinterpreting their query results. Query results can 
possibly be misinterpreted by users because of their 
complexity, which is in turn due to the queries being 
complex. A complex query is one which is made up of many 
basic queries with each basic query having its own distinct 
result. What a user gets when he makes a complex query is
a report in which these distinct basic query results have been
compounded. This compound report is not always easy for
the user to understand or interpret. This embodiment breaks
down a complex query into its basic components and 
processes each basic component (which is a basic query) 
separately so that what a user finally receives as a result is 
a disjoint set of distinct basic query results which the user 
can easily understand or interpret. 

This alternative embodiment involves adding a new mod- 
ule called the Knowledge Thread Analyser 50 to the query 
facility of FIG. 14 described earlier. An exemplary main 
flowchart for this alternative embodiment is shown in FIG. 
20. 

Knowledge Thread Analyser 

Before we describe the function of the Knowledge Thread 
Analyser 50, let us explain how query results can be
misinterpreted by illustrating two different cases. 
Case A: 

Suppose we have two database files as follows: 



File            Item       Key
INVOICE         Inv-No     Unique key is Inv-No
                Inv-Amt
                Customer
INVOICE_LINES   Inv-No     Repeating key is Inv-No
                Part-No
                Qty
                Price


Suppose also that INVOICE has only one record as
follows:

Inv-No   Customer   Inv-Amt
1        IBM        $1000

and INVOICE_LINES has 3 records as follows:

Inv-No   Part-No   Qty   Price
1        A         2     $50
1        B         3     $100
1        C         7     $175

Suppose a user picks the items Inv-No, Inv-Amt, Part-No,
Qty and Price as his query. The following query result would 
be produced after the query is processed. 



Inv-No Customer Inv-Amt Part-No Qty Price 

1 IBM $1000 A 2 $50 

1 IBM $1000 B 3 $100 

1 IBM $1000 C 7 $175 



This result is correct but can be misinterpreted by a user
because the Inv-Amt of $1,000 is repeated for every record
of INVOICE_LINES. A user may total up the Inv-Amt to
get what he thinks is the total invoice amount, namely
$3000. This is wrong as there is only one invoice amount
which is $1000.
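
The arithmetic behind this misreading can be reproduced with a short Python sketch. The records are those of the example above; the function name compound_report is an assumption made for illustration only.

```python
invoice = [{"Inv-No": 1, "Customer": "IBM", "Inv-Amt": 1000}]
invoice_lines = [
    {"Inv-No": 1, "Part-No": "A", "Qty": 2, "Price": 50},
    {"Inv-No": 1, "Part-No": "B", "Qty": 3, "Price": 100},
    {"Inv-No": 1, "Part-No": "C", "Qty": 7, "Price": 175},
]


def compound_report(invoices, lines):
    """Join each invoice to its lines, repeating the invoice columns per line."""
    return [
        {**inv, **line}
        for inv in invoices
        for line in lines
        if inv["Inv-No"] == line["Inv-No"]
    ]


report = compound_report(invoice, invoice_lines)

# Summing the repeated column counts the same invoice amount once per line.
naive_total = sum(row["Inv-Amt"] for row in report)

# Summing over the INVOICE file alone gives the true invoice amount.
true_total = sum(inv["Inv-Amt"] for inv in invoice)
```

Here naive_total is 3000 while true_total is 1000, which is exactly the discrepancy described above.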

This query is complex and can be broken down into two 
basic queries. One basic query is about finding out the 
amount for each invoice for each customer. The items 
Inv-No, Customer and Inv-Amt from the INVOICE file 
would provide this information. The other basic query is 
about finding out the details of each invoice, which details 
can be obtained from the items Inv-No, Customer, Part-No,
Qty, and Price. The above query result is a compound report
of these two basic queries. However, if both these basic 
queries were processed separately we would get a disjoint 
set of basic query results as follows: 



Inv-No   Customer   Inv-Amt   Part-No   Qty   Price
1        IBM        $1000

1        IBM                  A         2     $50
1        IBM                  B         3     $100
1        IBM                  C         7     $175

Notice that with this disjoint set of basic query results the
Inv-Amt is now correctly presented as a single value of
$1000 and not a repeating value of $1000.

This case can be identified by the fact that the query
involves two files that have a "one to many"
relationship. Alternatively, we can say that their binary
relationship is a UR (unique to repeating) or UN (unique to
non-key). We can also say that their binary relationship is a
"has_children" or "has_wards" type. Also, one of the items
selected is from the file whose unique key is used in the
binary relationship, this item being non-key and numeric.
For the example above, INVOICE is the file whose unique
key Inv-No is used in the binary relationship with
INVOICE_LINES. Inv-Amt is one of the items selected as
part of the query and it is non-key and numeric.
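
The identification rule for Case A can be written as a small predicate. The following Python sketch assumes a simplified representation in which each selected item records its file and whether it is a key and numeric; the names Item and is_case_a are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    name: str
    file: str
    is_key: bool
    is_numeric: bool


def is_case_a(relationship, unique_file, selected_items):
    """Return True when a pair of files shows the Case A pattern.

    relationship   -- binary relationship code of the pair, e.g. "UR" or "UN"
    unique_file    -- the file whose unique key is used in the relationship
    selected_items -- the items picked by the user
    """
    if relationship not in ("UR", "UN"):
        return False
    return any(
        item.file == unique_file and not item.is_key and item.is_numeric
        for item in selected_items
    )


selected = [
    Item("Inv-No", "INVOICE", is_key=True, is_numeric=True),
    Item("Inv-Amt", "INVOICE", is_key=False, is_numeric=True),
    Item("Part-No", "INVOICE_LINES", is_key=False, is_numeric=False),
]
case_a = is_case_a("UR", "INVOICE", selected)
```

For the INVOICE example the predicate holds because Inv-Amt is selected, non-key and numeric.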

Consider another case which can also lead to misinter- 
pretation by users. 
Case B:

Suppose we have three database files as follows:

File       Item       Key
EMPLOYEE   Emp_No     Unique key is Emp_No
           Emp_Name
SKILLS     Emp_No     Repeating key is Emp_No
           Skill
PAYS       Emp_No     Repeating key is Emp_No
           Month
           Salary


Suppose also that these three files contain the following
records:

EMPLOYEE:
Emp_No   Emp_Name
1        John
2        Sally

SKILLS:
Emp_No   Skill
1        COBOL
1        Fortran
1        Ada
2        C++
2        COBOL

PAYS:
Emp_No   Month   Salary
1        Jan     1000
1        Feb     1000
2        Jan     2000
2        Feb     2000



Suppose a query is formulated by picking the items
Emp_Name, Skill, Month, and Salary. Let us first consider
the case of using PowerHouse to process this query, fol- 
lowed by the case of using SQL to process this query. The 
reason for doing this is that both PowerHouse and SQL give 
different results, but each may still lead to misinterpretation 
by users. The following result would be produced after this 
query is processed by PowerHouse. 



Emp_Name   Skill     Month   Salary
John       COBOL     Jan     1000
John       Fortran   Feb     1000
John       Ada
Sally      C++       Jan     2000
Sally      COBOL     Feb     2000

A user may then use this result to find out all the Salary
records of each employee with COBOL skill. As the first and
fifth rows of the query result contain COBOL, he would think
that the Salary records come from these two rows only,
namely:

Emp_Name   Month   Salary
John       Jan     1000
Sally      Feb     2000

However, this is not totally correct as the correct result
should include all the Salary records of John and Sally, both
of whom have COBOL skill. The correct result is as follows:

Emp_Name   Month   Salary
John       Jan     1000
John       Feb     1000
Sally      Jan     2000
Sally      Feb     2000

Let us now consider the case of using SQL to process the
same query involving Emp_Name, Skill, Month and Salary.
The following result would be produced by SQL:

Emp_Name   Skill     Month   Salary
John       COBOL     Jan     1000
John       COBOL     Feb     1000
John       Fortran   Jan     1000
John       Fortran   Feb     1000
John       Ada       Jan     1000
John       Ada       Feb     1000
Sally      C++       Jan     2000
Sally      C++       Feb     2000
Sally      COBOL     Jan     2000
Sally      COBOL     Feb     2000

This result could be misinterpreted by a user because the
Salary of John and Sally is repeated many times. He may
total up the Salary column to find the total Salary for John
and Sally. What he would get respectively is 6000 and 8000.
This is incorrect. The correct result is 2000 and 4000,
respectively, since John's salary is 1000 each in January and
February and Sally's salary is 2000 each in January and
February.

This query is complex and involves two multi-valued
dependencies of EMPLOYEE, namely SKILLS and PAYS.
It can be broken down into two basic queries. One basic
query is about employees and their skills and the other basic
query is about employees and their pays. The different query
results by PowerHouse and SQL shown above are compound
reports of these two basic queries. On the other hand,
the disjoint set of results for these two basic queries is:

Emp_Name   Skill     Month   Salary
John       COBOL
John       Fortran
John       Ada
Sally      C++
Sally      COBOL
John                 Jan     1000
John                 Feb     1000
Sally                Jan     2000
Sally                Feb     2000

Notice now that with this report a user could easily find
out correctly all salary records of employees who have
COBOL skill. Also, with these two disjointed basic reports,
the user would not incorrectly total up the salary of John and
Sally.
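
The inflated totals of the SQL-style compound report can be checked with a short Python sketch over the example records. The helper names sql_style_join and total_salary are assumptions made for this illustration.

```python
employee = {1: "John", 2: "Sally"}
skills = [(1, "COBOL"), (1, "Fortran"), (1, "Ada"), (2, "C++"), (2, "COBOL")]
pays = [(1, "Jan", 1000), (1, "Feb", 1000), (2, "Jan", 2000), (2, "Feb", 2000)]


def sql_style_join():
    """Pair every skill of an employee with every pay record of that employee."""
    return [
        (employee[emp_no], skill, month, salary)
        for emp_no, skill in skills
        for pay_emp_no, month, salary in pays
        if pay_emp_no == emp_no
    ]


def total_salary(rows, name):
    """Sum the last column (Salary) of the rows belonging to one employee."""
    return sum(row[-1] for row in rows if row[0] == name)


compound = sql_style_join()

# The basic query on employees and their pays, processed on its own.
pays_only = [(employee[emp_no], month, salary) for emp_no, month, salary in pays]

inflated = (total_salary(compound, "John"), total_salary(compound, "Sally"))
correct = (total_salary(pays_only, "John"), total_salary(pays_only, "Sally"))
```

The compound report has ten rows and yields totals of 6000 and 8000, while the separately processed basic query yields 2000 and 4000.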

This case can be identified by the fact that the query
involves more than one multi-valued dependency (MVD).
For example, in the above case the attribute or item Skill
is multi-dependent on the attribute or item Emp_No, i.e.
each employee has a well-defined set of skills. Similarly, the
attributes or items Month and Salary are also multi-
dependent on Emp_No. Alternatively, we can say that it
involves at least three files in which one file has a "one-to-
many" relationship or a UR (unique-to-repeating) or a UN
(unique to non-key) binary relationship with each of the
other files or, in other words, one file has a "has_children"
or a "has_wards" binary relationship with each of the other
files.
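
The structural test for Case B can be sketched as follows. The sketch assumes the query's linkages are available as (parent file, child file, relationship code) triples; this representation and the name case_b_files are hypothetical.

```python
from collections import defaultdict

ONE_TO_MANY = {"UR", "UN"}


def case_b_files(relationships):
    """Return the files that have a one-to-many relationship with two or more other files."""
    children = defaultdict(set)
    for parent, child, kind in relationships:
        if kind in ONE_TO_MANY:
            children[parent].add(child)
    return sorted(f for f, kids in children.items() if len(kids) >= 2)


case_b = case_b_files([
    ("EMPLOYEE", "SKILLS", "UR"),
    ("EMPLOYEE", "PAYS", "UR"),
])
case_a_only = case_b_files([("INVOICE", "INVOICE_LINES", "UR")])
```

For the example above the result names EMPLOYEE, whereas the two-file INVOICE query of Case A yields an empty list.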

Let us now describe Knowledge Thread Analyser 50. It
takes as its input the knowledge thread (as defined earlier)
determined by Inference Engine 17 based on the items
selected by the user from a single class. Since the items
selected by a user are obtained from a single class,
every knowledge thread determined by Inference Engine 17
has as its thread-head the root kernel entity of the class from
which the items are selected. Knowledge Thread Analyser
50 then analyses this knowledge thread, breaking it down
first into simple knowledge threads to resolve the Case B
type of complex query problems, and second, for each simple
knowledge thread derived, breaking it down further into
smaller simple knowledge threads to resolve the Case A type
of complex query problems. The new knowledge threads
produced as a result of this analysis are then passed to
Program Generator 18 to generate the corresponding source
programs which are then compiled and executed to produce
the query results.

Before we describe in detail this exemplary procedure, let
us recap what a knowledge thread is by giving an example,
because it will help in understanding this procedure. Suppose
a user makes a query on the ABOUT EMPLOYEE
class described earlier by selecting the items Emp_Name,
Salary, Branch_Name, Br_Tot_Expenses, Country_Name,
Skill, Month and Amount. The knowledge thread
determined by Inference Engine 17 is as follows:

EMPLOYEE -> BRANCH (NU) -> COUNTRY (NU)
         -> SKILLS (UR)
         -> BILLINGS (UR)



This knowledge thread has one thread-head but many
thread-ends, with the thread-head being the root kernel
entity of the class. What we want to do is to break down this
complex thread into many simple threads, where each
simple thread has one thread-head and only one thread-end.
This complex knowledge thread can be broken down into
three simple knowledge threads as follows:

EMPLOYEE -> BRANCH (NU) -> COUNTRY (NU)
EMPLOYEE -> SKILLS (UR)
EMPLOYEE -> BILLINGS (UR)

Two different types of simple threads can be derived from
a complex thread determined by Inference Engine 17. One
type is the NU or RU threads, in which all the binary
relationships between two adjacent files in the thread are NU
or RU. Another type is the UR threads, in which all the binary
relationships between two adjacent files are UR. One characteristic
of the UR simple threads is that their thread-head
is the same as the thread-head of the complex thread. As for the
NU or RU simple threads, their thread-head is the file on the
complex thread which starts off the NU or RU relationship.
For example, the above complex knowledge thread has
one NU simple thread, namely

EMPLOYEE -> BRANCH (NU) -> COUNTRY (NU)

and two UR simple threads, namely

EMPLOYEE -> SKILLS (UR)
EMPLOYEE -> BILLINGS (UR)

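
The decomposition of a complex thread into simple threads can be sketched in Python. The sketch assumes the complex thread is held as a tree mapping each file to its (child file, relationship) links, and it only handles paths whose links all belong to one family, as in the example; the names simple_threads, classify and render are hypothetical.

```python
def simple_threads(tree, head):
    """Yield each head-to-end path as a list of (file, relationship to previous file)."""
    def walk(node, path):
        links = tree.get(node, [])
        if not links:
            yield path
            return
        for child, kind in links:
            yield from walk(child, path + [(child, kind)])

    yield from walk(head, [(head, None)])


def classify(thread):
    """Label a simple thread by the family of its binary relationships."""
    kinds = {kind for _file, kind in thread[1:]}
    if not kinds:
        return "single file"
    if kinds <= {"NU", "RU"}:
        return "NU-RU"
    if kinds == {"UR"}:
        return "UR"
    return "mixed"  # not covered by this sketch


def render(thread):
    """Format a thread in the notation used in the text."""
    parts = [thread[0][0]] + [f"{file} ({kind})" for file, kind in thread[1:]]
    return " -> ".join(parts)


complex_thread = {
    "EMPLOYEE": [("BRANCH", "NU"), ("SKILLS", "UR"), ("BILLINGS", "UR")],
    "BRANCH": [("COUNTRY", "NU")],
}
threads = list(simple_threads(complex_thread, "EMPLOYEE"))
rendered = [render(t) for t in threads]
labels = [classify(t) for t in threads]
```

Applied to the example, the sketch yields one NU thread and two UR threads, matching the lists given above.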
We shall now describe an exemplary embodiment of 
Knowledge Thread Analyser 50 in detail. It comprises the 
following steps: 
Step 1: Analyze the knowledge thread determined by 
Inference Engine 17 and derive the NU or RU simple 
knowledge threads and the UR simple knowledge 
threads. Store the NU or RU simple knowledge threads 
in NU-RU Thread File and the UR simple knowledge 
threads in UR Thread File. This resolves the Case B 
type of complex query problem. 
Step 2: Access the first simple knowledge thread in 
NU-RU Thread File. Starting from the thread-end ana- 
lyze each pair of adjacent files for Case A type of 
complex query. If a pair has a Case A type of complex 
query, generate a new knowledge thread comprising 
files from the thread-end through to the first file in the 
pair of adjacent files being analyzed, this first file 
forming the thread-head of the new knowledge thread. 
Assign to this new knowledge thread its respective set 
of user-selected items such that only its thread-head has
user selected items which are both non-key and 
numeric. Store this new knowledge thread in New 
Thread File. Repeat this step for each of the remaining 
simple threads in NU-RU Thread File. 
Step 3: Access the first simple knowledge thread in UR 
Thread File. Starting from the thread-head analyze each 
pair of adjacent files for Case A type of complex query. 
If a pair has a Case A type of complex query, generate 
a new knowledge thread comprising files from the 
thread-head through to the first file in the pair of 
adjacent files being analyzed, with the thread-head of 
the new knowledge thread being the same as the 
thread-head of the knowledge thread being analyzed. 
Assign to this new knowledge thread its respective set
of user-selected items such that only its thread-end has
user-selected items which are both non-key and
numeric. Store this new knowledge thread in New
Thread File. Repeat this step for each of the remaining
simple threads in UR Thread File.
Step 4: For each simple knowledge thread in UR Thread
File combine it with those NU or RU simple threads in
the NU-RU Thread File which have their thread-head
matching a file in the UR simple knowledge thread.
This complex thread becomes a new knowledge thread.
Assign this new knowledge thread its respective set of
user-selected items such that only its UR thread-end has
user-selected items which are both non-key and
numeric. Store this new knowledge thread in New
Thread File.
Step 5: Eliminate duplicate knowledge threads in the New
Thread File.
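
Steps 2, 3 and 5 can be illustrated with a minimal Python sketch; Step 4 is omitted. The sketch assumes a simple thread is a list of file names from thread-head to thread-end and that the user-selected items are given per file as (name, is_key, is_numeric) triples. All function names are hypothetical, and the item order in the output follows the order of the files, not the order used in the text.

```python
def has_measure(items, file):
    """True if some selected item of `file` is both non-key and numeric."""
    return any(
        not is_key and is_numeric
        for _name, is_key, is_numeric in items.get(file, [])
    )


def assign_items(items, files, measure_file):
    """Selected item names for `files`; non-key numeric items only from `measure_file`."""
    names = []
    for file in files:
        for name, is_key, is_numeric in items.get(file, []):
            if file == measure_file or is_key or not is_numeric:
                names.append(name)
    return tuple(names)


def step2(thread, items):
    """NU or RU simple thread, scanned pair by pair from the thread-end."""
    new_threads = []
    for j in range(len(thread) - 1, 0, -1):
        if has_measure(items, thread[j]):
            files = tuple(thread[j:])          # thread[j] becomes the new thread-head
            new_threads.append((files, assign_items(items, files, thread[j])))
    return new_threads


def step3(thread, items):
    """UR simple thread, scanned pair by pair from the thread-head."""
    new_threads = []
    for i in range(len(thread) - 1):
        if has_measure(items, thread[i]):
            files = tuple(thread[: i + 1])     # thread[i] becomes the new thread-end
            new_threads.append((files, assign_items(items, files, thread[i])))
    return new_threads


def step5(new_threads):
    """Eliminate duplicates while keeping the first occurrence of each thread."""
    return list(dict.fromkeys(new_threads))


selected = {
    "EMPLOYEE": [("Emp_Name", False, False), ("Salary", False, True)],
    "BRANCH": [("Branch_Name", False, False), ("Br_Tot_Expenses", False, True)],
    "COUNTRY": [("Country_Name", False, False)],
    "SKILLS": [("Skill", False, False)],
    "BILLINGS": [("Month", False, False), ("Amount", False, True)],
}
new_thread_file = (
    step2(["EMPLOYEE", "BRANCH", "COUNTRY"], selected)
    + step3(["EMPLOYEE", "SKILLS"], selected)
    + step3(["EMPLOYEE", "BILLINGS"], selected)
)
deduplicated = step5(new_thread_file)
```

For the ABOUT EMPLOYEE query this produces the BRANCH -> COUNTRY thread once and the single-file EMPLOYEE thread twice before Step 5 removes the duplicate.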

Let us apply this procedure on the earlier example of a
query on the ABOUT EMPLOYEE class to show how the
above procedure works. When we apply step 1, we get the
following NU simple knowledge thread:

EMPLOYEE -> BRANCH (NU) -> COUNTRY (NU)

and the following UR simple knowledge threads:

EMPLOYEE -> SKILLS (UR)
EMPLOYEE -> BILLINGS (UR)
These simple threads are stored in the NU-RU Thread File
and the UR Thread File, respectively. Next we use step 2 on
the NU simple knowledge thread. The first pair of adjacent
files starting from the thread-end are COUNTRY and
BRANCH. Since the user-selected item from the COUNTRY
file is Country_Name which is not numeric, there is no Case
A type of complex query in this pair of adjacent files. So no
new knowledge thread is generated for this pair. The next
pair comprises BRANCH and EMPLOYEE. Since one of
the user-selected items from BRANCH is Br_Tot_Expenses
which is both non-key and numeric, there is a Case
A type of complex query in this pair. The following new
knowledge thread is generated as a result:

BRANCH -> COUNTRY (NU)

The user-selected items assigned to it are Country_Name,
Branch_Name, and Br_Tot_Expenses. Notice that
only the thread-head, namely BRANCH, has the item which
is both non-key and numeric, namely Br_Tot_Expenses.
This new knowledge thread is then stored in the New
Thread File.

As there are no more NU simple knowledge threads, we
next apply step 3 on the UR simple knowledge threads. The
first UR simple knowledge thread is

EMPLOYEE -> SKILLS (UR)

There is only one pair of adjacent files in this thread. Since one
of the user-selected items from EMPLOYEE is Salary which
is both non-key and numeric, there is a Case A type of
complex query in this pair. The following new knowledge
thread is generated as a result:

EMPLOYEE

The user-selected items assigned to it are Emp_Name
and Salary. This new knowledge thread is then stored in the
New Thread File.

The next UR simple knowledge thread is

EMPLOYEE -> BILLINGS (UR)

For this pair of files, the following new knowledge thread
is also generated:

EMPLOYEE

with Emp_Name and Salary being assigned to it. It is then
stored in the New Thread File.

Next we apply step 4. We combine the NU simple thread
EMPLOYEE -> BRANCH (NU) -> COUNTRY (NU) with
the UR simple knowledge thread EMPLOYEE -> SKILLS
(UR) to derive the following new knowledge thread:

EMPLOYEE -> BRANCH (NU) -> COUNTRY (NU)
         -> SKILLS (UR)

The user-selected items assigned to it are Emp_Name,
Branch_Name, Country_Name and Skill. This new knowledge
thread is then stored in the New Thread File.

Another new knowledge thread is also formed as follows:

EMPLOYEE -> BRANCH (NU) -> COUNTRY (NU)
         -> BILLINGS (UR)

The user-selected items assigned to it are Emp_Name,
Branch_Name, Month, and Amount. This new knowledge
thread is then stored in the New Thread File. Notice that only
the UR thread-end, namely BILLINGS, has the user-
selected item which is both non-key and numeric, namely
Amount.

Step 5 is applied next. In this step the duplicate thread,
namely EMPLOYEE, in the New Thread File is eliminated.
This completes the analysis of the complex knowledge
thread determined by the Inference Engine 17 by the Knowledge
Thread Analyser 50.

In this embodiment Model Purifier 26 and Security Model 
Specifier 29 are optional modules. 

Thus this embodiment ensures that end-users will not
misinterpret their query results whenever they make complex
queries, as it ensures that the query results produced are
presented as a disjointed set of basic query results.
Connections to Existing Database Applications

Many applications today are built using a standard application
program interface (API) that allows them to be easily
linked to other software. One type of standard, which is a
data access interface standard, relates to the linking of
applications to different database management systems
(DBMS). Open Database Connectivity (ODBC) from
Microsoft Corporation is one of the popular standards for
such a purpose. This type of standard allows applications to
be developed, compiled and shipped without targeting a
specific DBMS so long as they make function calls to a
database using the API defined by this standard. In order to
link such an application to the DBMS of their choice, users
then add a module called a database connectivity driver
specific to that DBMS. A database connectivity driver
implements the function calls of the standard API according
to the requirements of the DBMS it is designed for.
Many of these applications provide a primitive ad-hoc query
capability in that users of these applications must understand
the database model in order to perform an ad-hoc query. This
invention relates to a novel approach for the construction of
this end-user query facility through the use of what we call
a Query Facility Connectivity Driver so that we can seamlessly
link the end-user query facility to these applications to
enhance their query capability while still allowing the linking
of these applications to the DBMS of the users' choice
using a database connectivity driver specific to that DBMS.
In accordance with the teachings of the invention, this
seamless linking of the application to the end-user query
facility is achieved by adding the Query Facility Connectivity
Driver to the application in the same way as adding a
database connectivity driver to link the application to a
DBMS. At the same time a user can still link his application
to the DBMS of his choice as he can continue to add one or
more database connectivity drivers specific to that DBMS in
the same manner as before. All these are made possible by
constructing the end-user query facility to include a Query
Facility Connectivity Driver and by constructing this driver
firstly to have the same standard data access interface as a
database connectivity driver so that it can be added as a
module to the application just like adding a database connectivity
driver, and, secondly, to have an interface that is
compliant with the same data access interface standard as the
application so that the end-user query facility can be linked
to the DBMS of the user's choice by adding a database
connectivity driver specific to that DBMS in the same way
as the application can be linked to the DBMS by adding a
database connectivity driver specific to that DBMS.

The following describes one embodiment of this invention
using ODBC from Microsoft Corporation as the standard
data access interface. However, it is to be understood
that the teachings of this invention are equally applicable to
any other standard data access interface.

Before we describe this invention, let us first describe
what ODBC is.
ODBC

In the traditional database world, an application is tied to
a specific database management system (DBMS). An application
could be a payroll system or a spreadsheet with query
capability. Such applications are usually written using
embedded SQL. Though embedded SQL is efficient and
portable across different hardware and operating systems,
the source code must be recompiled for each new environment.
Also, it is not optimal if the applications need to access
data in different DBMS such as IBM DB2 or Oracle. One
version of the application would have to be recompiled with
the IBM precompiler and another version with the Oracle
precompiler, resulting in the user having to purchase two
versions of the same application instead of one in order to
be able to access both DBMS.

The ODBC interface defined by Microsoft Corporation,
on the other hand, allows an application to be developed and
compiled without targeting a specific DBMS. The users of
such an application then add modules called database connectivity
drivers to link the application to their choice of
database management systems. These database drivers are
dynamic link libraries that an application can invoke on
demand to gain access to a particular data source through a
particular communications method, much like a printer
driver running under Microsoft's Windows. ODBC provides
the standard interface that allows data to be shuttled between
the applications and the data sources.
The ODBC interface defines the following: 
A library of ODBC function calls that allow an applica- 
tion to connect to a DBMS, execute SQL statements 
and retrieve results. 
The SQL syntax used is based on the X/Open and SQL 
Access Group (SAG) SQL CAE Specifications (1992). 
A standard set of error codes. 
A standard way to connect and log on to a DBMS. 
A standard representation for data types. 
The ODBC architecture has four components, namely:
Application: performs processing of its application func- 
tions and calls ODBC functions to submit SQL state- 
ments and retrieve results. 
Database Connectivity Manager: loads the database con- 
nectivity drivers on behalf of the application. 
Database Connectivity Driver: processes ODBC function 
calls, submits SQL requests to a specific data source 
and returns results to the application. 
Data Source: consists of the data the user wants to access, 
the associated operating system, the DBMS and the 
network platform (if any) used to access the DBMS. 
FIG. 21 shows the relationship among the four components
of the ODBC architecture. Each component carries out
different functions as listed below.
Functions of Application

Connects to the Data Source by specifying the data source 
name 

Processes one or more SQL statements as follows: 
The Application places the SQL text string in a buffer.
If the statement returns a result, the Application assigns
a cursor name for the statement or allows the Database
Connectivity Driver to do so.
The Application submits the statement for prepared or
immediate execution.
If the statement creates a result, the Application can
enquire about the attributes of the result such as the
number of columns and the name and type of a
specific column. It assigns buffers for each column in
the result and fetches the result.
If the statement causes an error, the Application 
retrieves error information from the driver and takes 
appropriate action. 
Ends each transaction by committing it or rolling it 
back. 

Terminates the connection when it has finished inter- 
acting with the Data Source. 
Functions of Database Connectivity Manager 

A dynamic-link library (DLL) whose primary purpose is
to load ODBC drivers. 

Also processes several ODBC initialization and informa- 
tion calls. 

Passes ODBC function calls from Application to Database
Connectivity Drivers.

Performs error and state checking. 
Functions of Database Connectivity Drivers

A DLL that implements ODBC function calls and inter- 
acts with a Data Source. 

It performs the following tasks in response to ODBC 
function calls from an Application: 

Establishes a connection to the Data Source. 
Submits requests to the Data Source.
Translates data to or from other formats.
Returns results to the Application.
Declares and manipulates cursors.
Tasks performed by Data Source 

Requires its name, a user ID and a password to be
specified by a user of the Application for it to be connected.

Processes SQL requests received from the Database Con- 
nectivity Driver. 

Returns result to the Database Connectivity Driver. 
Query Facility Connectivity Driver 

FIG. 22 shows Query Facility 65 linked to Application 60
using Query Facility Connectivity Driver 64. Query Facility
Connectivity Driver 64 "slots" in between Database
Connectivity Manager 61 and Database Connectivity Driver 62,
replacing Report Item Selector 16 of Information Scout 15
(FIG. 1) used in previous embodiments of the query facility.

Query Facility Connectivity Driver 64 has the same or a
substantially similar ODBC interface as Database Connectivity
Driver 62 so that it can be used by Application 60 as
though it were a Database Connectivity Driver. However,
unlike Database Connectivity Driver 62, it implements
many of the ODBC functions in a different way, since it is
used to connect Application 60 to Query Facility 65 instead
of to the Application Database. Table 3 gives a comparison of
the different implementations of the ODBC calls between
Database Connectivity Driver 62 and Query Facility
Connectivity Driver 64.

Let us illustrate with an example how an Application 60
connects to Query Facility 65 through Query Facility
Connectivity Driver 64 and uses it to make powerful ad-hoc
queries easily. We shall illustrate using the Microsoft Excel
spreadsheet as Application 60 making a query on an Application
Database serving as a personnel system having the
following database tables:
Table       Columns
EMPLOYEE    Emp_No         Primary Key is Emp_No
            Emp_Name
            Sex
            Address
PROJECT     Proj_No        Primary Key is Proj_No
            Proj_Name
            Proj_Manager
BILLINGS    Emp_No         Primary Key is Emp_No & Proj_No
            Proj_No        Foreign Key is Emp_No referencing EMPLOYEE
            Month          Foreign Key is Proj_No referencing PROJECT
            Amount

After semantic extraction of this database model by
Semantics Extractor 12 of Query Facility 65, Keyword
Library 13 contains the following keywords:



Emp_No 

ProjecLJfo 

P«y_No 



Emp_Name
Proj_Name
Month



Sex 

Proj_Manager
Amount 



Address 



and Knowledge Base 14 contains the following knowledge
threads:
EMPLOYEE→BILLINGS
PROJECT→BILLINGS
EMPLOYEE→BILLINGS→PROJECT
Suppose a user wants a list of all employee names and the 
names of the projects they are working on. We would now 
like to describe how the user uses Microsoft Excel to make 
this query. We would also like to describe the various actions 
taken by Query Facility Connectivity Driver 64 in response 
to the ODBC calls made by Microsoft Excel in performing 
this query. But before we do so let us explain how Database 
Connectivity Driver 62 and Query Facility Connectivity 
Driver 64 are configured so that they can be used by 
Application 60 which is Microsoft Excel in this query 
example. 

Let us assume that a Database Connectivity Driver 62
specific to DBMS 63 is used for the personnel system
Application Database which has been loaded in. The user
first starts a program called the ODBC Administrator. A
window for adding data sources is then displayed. He
presses the "Add" button to add a data source. Another
window is then displayed to show a list of database
connectivity drivers that have been loaded in. He selects the
required Database Connectivity Driver 62 for his query on
the personnel system. This driver then displays a window
asking him to define his data source. He defines his data
source to be the Application Database containing the
personnel system and gives it the name "PERSONNEL_SYSTEM".
This completes the configuration of Database
Connectivity Driver 62.

Let us now assume that Query Facility Connectivity 
Driver 64 has also been loaded in. The user again starts the 
ODBC Administrator to display a window for adding data 
sources. He presses the "Add" button to add a data source.
Another window showing a list of database connectivity 
drivers including Query Facility Connectivity Driver 64 
appears. This time he selects Query Facility Connectivity 
Driver 64. This driver then displays a window asking him to 
define his data source. He defines his data source to be 
Keyword Library 13 of Query Facility 65 and gives it the 
name "ABOUT_PERSONNEL". This Keyword Library 13
for the query example contains the keywords of the personnel
system. At this point we have only partly configured
Query Facility Connectivity Driver 64 because, unlike Database
Connectivity Driver 62, we need to configure another
data source. Another window is displayed for the user to
configure another data source. He configures
"PERSONNEL_SYSTEM" as the other data source which
by the way is also the data source of Database Connectivity
Driver 62. This completes the configuration of Query Facility
Connectivity Driver 64.

Let us now describe how the user performs his query on the
personnel system using Microsoft Excel aided by Query
Facility 65. The user first launches Microsoft Excel. He then
starts up MS-Query which is a module within Microsoft
Excel for performing ad-hoc query. A dialog box appears
showing him a number of data sources that he can connect
to. He selects "ABOUT_PERSONNEL" as his data source.
(If he had used Microsoft Excel of the prior art without
Query Facility 65 and Query Facility Connectivity Driver 64
as taught by this embodiment, the "ABOUT_PERSONNEL"
data source would not be shown on the
dialog box and he would have selected "PERSONNEL_SYSTEM"
as his data source.) MS-Query then makes the
following ODBC calls in the order given to connect to the
selected data source:

SQLAllocEnv

SQLAllocConnect

SQLGetInfo

SQLDriverConnect

The SQLAllocEnv call causes Database Connectivity
Manager 61 to allocate storage for information about the
ODBC environment as well as for special information
required to run Query Facility Connectivity Driver 64 while
the SQLAllocConnect call causes Query Facility Connectivity
Driver 64, firstly, to allocate storage for information
about the connection to the "ABOUT_PERSONNEL" data
source and, secondly, to make the same call to Database
Connectivity Driver 62 to allocate storage for information
about the connection to the "PERSONNEL_SYSTEM"
data source. The SQLGetInfo call causes Query Facility
Connectivity Driver 64 to return its version number. The
SQLDriverConnect call causes Query Facility Connectivity
Driver 64 to do a number of things. Firstly, it connects to the
"ABOUT_PERSONNEL" data source which is Keyword
Library 13 of Query Facility 65. Secondly, it prompts the
user for the password to Keyword Library 13. Thirdly, it
makes the same call to Database Connectivity Driver 62 so
as to connect to the "PERSONNEL_SYSTEM" data source
which is the Application Database containing the personnel
system. (If Microsoft Excel of the prior art had been used
without Query Facility 65 and Query Facility Connectivity
Driver 64 as taught by this embodiment, these same four
ODBC calls would result in different actions being taken.
The first call would cause Database Connectivity Manager
61 to allocate storage for information about the ODBC
environment while the second call would cause Database
Connectivity Driver 62 to allocate storage for information
about the connection to the "PERSONNEL_SYSTEM"
data source. The third call would cause Database Connectivity
Driver 62 to return its version number while the last
call would cause Database Connectivity Driver 62 to connect
to the "PERSONNEL_SYSTEM" data source.)

After the data sources have been connected, MS-Query
issues the SQLGetFunctions call to find out what DBMS 63
functions are supported by Query Facility Connectivity
Driver 64. Query Facility Connectivity Driver 64 first passes
the call to Database Connectivity Driver 62. It then takes the
return results from Database Connectivity Driver 62 and
modifies them to delete those functions it does not support
before returning the results to MS-Query. Two of the functions
it does not support are the 'update' and 'create'
functions, since Query Facility 65 is intended only for query.
(If Microsoft Excel had been used without Query Facility 65
and Query Facility Connectivity Driver 64, this call would
be made to Database Connectivity Driver 62 and the results
returned would be all the functions supported by the DBMS
including the 'create' and 'update' functions.)

Next MS-Query issues a number of SQLGetInfo calls.
These calls cause Query Facility Connectivity Driver 64 to
return information about itself and the data source it is
connected to, which is "ABOUT_PERSONNEL". (If
Microsoft Excel had been used without Query Facility 65
and Query Facility Connectivity Driver 64, these calls
would cause Database Connectivity Driver 62 to return
information about itself and the data source it is connected
to, which is "PERSONNEL_SYSTEM".)

MS-Query next issues a SQLTables call. This call causes
Query Facility Connectivity Driver 64 to prepare to return
information regarding the table in Keyword Library 13.
There is only one table in Keyword Library 13. This table is
used to hold all the keywords in Keyword Library 13. For
the query example using the personnel system let us call this
table PERSONNEL. Following this MS-Query makes a
SQLBindCol call to associate buffers to hold the table
definition of the PERSONNEL table. It next issues a
SQLFetch call to fetch and copy the table definition into the
buffers. After this it makes a series of calls starting with
SQLGetTypeInfo followed by SQLBindCol and SQLFetch
to get information on the data types used by the Application
Database. As such information is also contained in Keyword
Library 13, Query Facility Connectivity Driver 64 obtains
it from Keyword Library 13. (If Microsoft Excel had
been used without Query Facility 65 and Query Facility
Connectivity Driver 64, the SQLTables call would cause
Database Connectivity Driver 62 to prepare to return
information regarding the tables in the Application Database.
These tables for the query example are the tables of the
personnel system, namely EMPLOYEE, PROJECT and
BILLINGS. The next two calls, namely SQLBindCol and
SQLFetch, would cause Database Connectivity Driver 62 to
associate buffers and then fetch and copy the table definitions
of the EMPLOYEE, PROJECT and BILLINGS tables
into the buffers. The next series of calls starting with
SQLGetTypeInfo followed by SQLBindCol and SQLFetch
would cause Database Connectivity Driver 62 to obtain the
information on data types used by the Application
Database.)

MS-Query next issues a number of SQLGetInfo calls to
get the keywords used by DBMS 63. Among the keywords
obtained are (i) whether a table is referred to as "table" or
"file" by the specific DBMS and (ii) whether an owner is
referred to as "owner" or "authorization id". These calls
cause Query Facility Connectivity Driver 64 to in turn make
the same calls to Database Connectivity Driver 62 to get the
required information from DBMS 63. (If Microsoft Excel
had been used without Query Facility 65 and Query Facility
Connectivity Driver 64, these calls would cause Database
Connectivity Driver 62 to return the required information
direct to MS-Query.)

At this point MS-Query stops issuing more ODBC calls.
Instead, using the information it has obtained from the
ODBC calls it has made, it displays a dialog box showing
the name(s) of the table(s) from the selected data source.
This data source is "ABOUT_PERSONNEL" which has
only one table PERSONNEL and so only the table name
PERSONNEL is shown. To formulate his query the user
needs a list of the keywords of Keyword Library 13 to be
displayed. He therefore requests through this dialog box that
all the columns of the "PERSONNEL" table be displayed.
MS-Query then issues a series of calls comprising
SQLColumns, SQLBindCol and SQLFetch. The first call
asks Query Facility Connectivity Driver 64 to prepare to
return information about the columns of the PERSONNEL
table in Keyword Library 13. The second call causes Query
Facility Connectivity Driver 64 to associate a result buffer
with a column in the result set, the result set in this case
being all the keywords in Keyword Library 13. The third call
causes Query Facility Connectivity Driver 64 to fetch the
keywords from Keyword Library 13 and place them in the
result buffer. (If Microsoft Excel had been used without
Query Facility 65 and Query Facility Connectivity Driver
64, the names of the tables displayed in the dialog box would
have been EMPLOYEE, PROJECT and BILLINGS. If the
user wanted only the columns of the EMPLOYEE table and had
made the appropriate selection, MS-Query would issue the
same calls and these calls would have caused Database
Connectivity Driver 62 to prepare to return information
about the columns of the EMPLOYEE table, associate a
result buffer and fetch the columns of EMPLOYEE from the
Application Database into the result buffer.)
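
The difference between the two drivers on the catalogue calls can be illustrated with a small sketch. It is not the ODBC API: the class and method names are hypothetical, the function names are plain strings, and the inner driver's function set is an assumed example. It shows the wrapping driver removing the 'create' and 'update' functions from the inner driver's answer and presenting Keyword Library 13 as a single PERSONNEL table whose columns are the keywords.

```python
class DatabaseDriver:
    """Hypothetical stand-in for Database Connectivity Driver 62."""

    def get_functions(self):
        # Assumed example set; a real driver reports many more functions.
        return {"select", "create", "update"}

    def tables(self):
        return ["EMPLOYEE", "PROJECT", "BILLINGS"]


class QueryFacilityDriver:
    """Hypothetical stand-in for Query Facility Connectivity Driver 64."""

    UNSUPPORTED = {"create", "update"}  # the two functions named in the text

    def __init__(self, inner, keywords):
        self.inner = inner
        self.keywords = list(keywords)

    def get_functions(self):
        # Pass the call on, then delete what the query facility does not support.
        return self.inner.get_functions() - self.UNSUPPORTED

    def tables(self):
        # The keyword library is exposed as one table.
        return ["PERSONNEL"]

    def columns(self, table):
        if table != "PERSONNEL":
            raise KeyError(table)
        return list(self.keywords)  # every keyword appears as a column


keywords = ["Emp_Name", "Proj_Name", "Month", "Amount"]  # subset, for brevity
driver = QueryFacilityDriver(DatabaseDriver(), keywords)
supported = driver.get_functions()
personnel_columns = driver.columns("PERSONNEL")
```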

After all the keywords of Keyword Library 13 have been
fetched, MS-Query then displays them to the user as the
columns of the PERSONNEL table. The user then formulates
his query by selecting the appropriate columns from
this table. Since his query is to get a list of employee names
and the names of the projects they are working on, he selects
the columns Emp_Name and Proj_Name. MS-Query then
generates the following SQL statement:

Select 'PERSONNEL'.'Emp_Name', 'PERSONNEL'.'Proj_Name'
From 'PERSONNEL' 'PERSONNEL'

It next makes a SQLExecDirect call to have this SQL
statement executed. Query Facility Connectivity Driver 64
upon receiving this call parses the SQL statement to extract
the items selected by the user which in this query example
are Emp_Name and Proj_Name. It then calls Inference
Engine 17 of Query Facility 65 to determine the access path
through the personnel system using the items selected.
Program Generator 18 is then called to use this access path
to generate an SQL source program. For the query example
the SQL source program generated is as follows:

Select 'EMPLOYEE'.'Emp_Name', 'PROJECT'.'Proj_Name'

From 'EMPLOYEE' 'EMPLOYEE', 'PROJECT' 'PROJECT',
'BILLINGS' 'BILLINGS'

Where 'EMPLOYEE'.'Emp_No'='BILLINGS'.'Emp_No' AND
'PROJECT'.'Proj_No'='BILLINGS'.'Proj_No'
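
The step from selected keywords to a joined SQL statement can be sketched as follows. The text does not say how Inference Engine 17 finds the access path, so a breadth-first search over the foreign-key links is used here purely as a stand-in. The dictionary contents, the function names, the sample rows and the use of sqlite3 in place of DBMS 63 are all assumptions made for illustration.

```python
import sqlite3
from collections import deque

# Keyword -> (table, column); assumed contents of the keyword library.
KEYWORDS = {
    "Emp_Name": ("EMPLOYEE", "Emp_Name"),
    "Proj_Name": ("PROJECT", "Proj_Name"),
}
# Join conditions taken from the two foreign keys of BILLINGS.
JOINS = {
    ("EMPLOYEE", "BILLINGS"): "EMPLOYEE.Emp_No = BILLINGS.Emp_No",
    ("PROJECT", "BILLINGS"): "PROJECT.Proj_No = BILLINGS.Proj_No",
}


def neighbours(table):
    """Yield every table directly linked to the given table."""
    for a, b in JOINS:
        if a == table:
            yield b
        elif b == table:
            yield a


def access_path(start, goal):
    """Breadth-first search for the shortest chain of linked tables."""
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in neighbours(path[-1]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    raise ValueError("no access path")


def generate_sql(selected):
    """Build a SELECT statement for the selected keywords."""
    cols = [KEYWORDS[k] for k in selected]
    path = access_path(cols[0][0], cols[-1][0])
    conds = [JOINS.get((a, b)) or JOINS[(b, a)] for a, b in zip(path, path[1:])]
    sql = "SELECT " + ", ".join(f"{t}.{c}" for t, c in cols)
    sql += " FROM " + ", ".join(path)
    if conds:
        sql += " WHERE " + " AND ".join(conds)
    return sql


# Run the generated statement against a tiny in-memory personnel system.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (Emp_No INTEGER PRIMARY KEY, Emp_Name TEXT);
CREATE TABLE PROJECT (Proj_No INTEGER PRIMARY KEY, Proj_Name TEXT);
CREATE TABLE BILLINGS (Emp_No INTEGER, Proj_No INTEGER, Amount REAL);
INSERT INTO EMPLOYEE VALUES (1, 'Tan');
INSERT INTO PROJECT VALUES (10, 'Payroll');
INSERT INTO BILLINGS VALUES (1, 10, 500.0);
""")
sql = generate_sql(["Emp_Name", "Proj_Name"])
rows = conn.execute(sql).fetchall()
conn.close()
```

The path found for the example is EMPLOYEE, BILLINGS, PROJECT, which matches the third knowledge thread listed earlier.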

Query Facility Connectivity Driver 64 then issues the
SQLExecDirect call to Database Connectivity Driver 62 to
have this SQL source program executed by DBMS 63 to
obtain the desired information from the Application Database.
(If Microsoft Excel had been used without Query
Facility 65 and Query Facility Connectivity Driver 64, the
user would have to understand the personnel system database
model to formulate this query. This means that he
would have to know firstly that it requires the use of the
column Emp_Name from the EMPLOYEE table and the
column Proj_Name from the PROJECT table. Secondly,
since these two tables are not directly connected but instead
are connected through the BILLINGS table he would have
to know about the relationships between these tables.
Therefore, in order to formulate this query he must
request that the columns of the three tables be displayed. Let
us assume that this has been done and that the display shows
the columns of the three tables with their appropriate joins.
The user would then select the column Emp_Name of the
EMPLOYEE table and Proj_Name of the PROJECT table as his
query. MS-Query would then generate the following SQL
statement:

Select 'EMPLOYEE'.'Emp_Name', 'PROJECT'.'Proj_Name'
From 'EMPLOYEE' 'EMPLOYEE', 'PROJECT' 'PROJECT',
'BILLINGS' 'BILLINGS'
Where 'EMPLOYEE'.'Emp_No'='BILLINGS'.'Emp_No' AND
'PROJECT'.'Proj_No'='BILLINGS'.'Proj_No'

Next it would make the SQLExecDirect call to Database
Connectivity Driver 62 to send this SQL statement to DBMS
63 to be executed to extract the desired information from the
Application Database.)

MS-Query next issues a series of ODBC calls to get the
results of the query to be displayed. It first makes the
SQLNumResultCols call to find out how many columns of
the query results will be retrieved. For the query example the
number of columns to be retrieved is two, one being the
Emp_Name and the other being the Proj_Name of the
PERSONNEL table. It next issues the SQLDescribeCol and
the SQLColAttributes calls to obtain information about
these two columns. To get this information Query Facility
Connectivity Driver 64 first maps these two columns of the
PERSONNEL table to the actual database columns, namely
the Emp_Name of the EMPLOYEE table and the Proj_Name
of the PROJECT table. Next, it issues the same calls to Database
Connectivity Driver 62 to get the required information from
the Application Database. After this MS-Query issues the
SQLBindCol call to ask Query Facility Connectivity Driver
64 to allocate buffers. Next it issues the SQLFetch call. The
SQLFetch call causes Query Facility Connectivity Driver 64
to make the same call to Database Connectivity Driver 62 to
fetch the query results from the Application Database and
place them in the associated buffers. (If Microsoft Excel had
been used without Query Facility 65 and Query Facility
Connectivity Driver 64, MS-Query would issue the same
SQLNumResultCols call to find out how many columns
would be retrieved. As the columns selected are Emp_Name
of EMPLOYEE and Proj_Name of PROJECT, the
number of columns to be retrieved would be two. MS-Query
would next issue the same two SQLDescribeCol and
SQLColAttributes calls. This would cause Database Connectivity
Driver 62 to directly obtain the required information about these
columns from the Application Database. After this
MS-Query would issue the SQLBindCol and SQLFetch
calls which would cause Database Connectivity Driver 62 to
associate buffers and to directly fetch the query results from
the Application Database and place them in the associated
buffers.)
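
The column-mapping step described above can be pictured with a short sketch. The mapping table, the class and method names, and the column types and lengths are all hypothetical; the point is only that the wrapping driver translates a PERSONNEL column into the real table and column before passing the same call on.

```python
# Hypothetical mapping from PERSONNEL keyword columns to real columns.
COLUMN_MAP = {
    "Emp_Name": ("EMPLOYEE", "Emp_Name"),
    "Proj_Name": ("PROJECT", "Proj_Name"),
}


class DatabaseDriver:
    """Hypothetical stand-in for Database Connectivity Driver 62."""

    # Assumed catalogue entries; types and lengths are made up.
    CATALOG = {
        ("EMPLOYEE", "Emp_Name"): {"type": "CHAR", "length": 30},
        ("PROJECT", "Proj_Name"): {"type": "CHAR", "length": 40},
    }

    def describe_col(self, table, column):
        return dict(self.CATALOG[(table, column)], table=table, name=column)


class QueryFacilityDriver:
    """Hypothetical stand-in for Query Facility Connectivity Driver 64."""

    def __init__(self, inner):
        self.inner = inner

    def describe_col(self, keyword):
        table, column = COLUMN_MAP[keyword]  # map keyword to the real column
        return self.inner.describe_col(table, column)  # same call, passed on


driver = QueryFacilityDriver(DatabaseDriver())
info = driver.describe_col("Proj_Name")
```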

At this point the user has his query results displayed by
MS-Query. If he does not wish to make any more queries, he
can quit from MS-Query. When he quits this causes
MS-Query to issue the following calls:

SQLDisconnect

SQLFreeConnect

SQLFreeEnv

The first call causes Query Facility Connectivity Driver
64 to disconnect from Keyword Library 13 and to also make
the same call to Database Connectivity Driver 62 to disconnect
from the Application Database. The second call causes
Query Facility Connectivity Driver 64 to free the connection
handle to Keyword Library 13 and to make the same call to
Database Connectivity Driver 62 to free the connection
handle to the Application Database. The third call causes
Query Facility Connectivity Driver 64 to free its environment
handle and to make the same call to Database Connectivity
Driver 62 to free its environment handle. (If
Microsoft Excel had been used without Query Facility 65
and Query Facility Connectivity Driver 64, these three calls
would cause Database Connectivity Driver 62 to disconnect
from the Application Database, to free the connection handle
to the Application Database and to free its environment
handle.)
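
The pairing of each call with a matching call to the underlying driver can be sketched as below. This is a simplified model, not ODBC: the classes, method names and log strings are hypothetical, and the password prompt described earlier is left out.

```python
class DatabaseDriver:
    """Hypothetical stand-in for Database Connectivity Driver 62."""

    def __init__(self, log):
        self.log = log

    def connect(self, dsn):
        self.log.append(f"database driver: connect {dsn}")

    def disconnect(self):
        self.log.append("database driver: disconnect")

    def free_connect(self):
        self.log.append("database driver: free connection handle")

    def free_env(self):
        self.log.append("database driver: free environment handle")


class QueryFacilityDriver:
    """Hypothetical stand-in for Query Facility Connectivity Driver 64."""

    def __init__(self, inner, log):
        self.inner, self.log = inner, log

    def connect(self, keyword_dsn, database_dsn):
        self.log.append(f"query facility driver: connect {keyword_dsn}")
        self.inner.connect(database_dsn)  # same call to the inner driver

    def disconnect(self):
        self.log.append("query facility driver: disconnect")
        self.inner.disconnect()

    def free_connect(self):
        self.log.append("query facility driver: free connection handle")
        self.inner.free_connect()

    def free_env(self):
        self.log.append("query facility driver: free environment handle")
        self.inner.free_env()


log = []
driver = QueryFacilityDriver(DatabaseDriver(log), log)
driver.connect("ABOUT_PERSONNEL", "PERSONNEL_SYSTEM")
driver.disconnect()
driver.free_connect()
driver.free_env()
```

Each entry in the log written by the query facility driver is followed by the corresponding entry from the database driver, which is the ordering the paragraph describes.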

Note that in the above explanation for the sake of clarity
a number of ODBC calls made repeatedly by MS-Query
have not been mentioned, e.g. SQLAllocStmt and
SQLFreeStmt. The SQLAllocStmt call is used to cause Query
Facility Connectivity Driver 64, firstly, to allocate SQL
statement data structures to contain whatever information
about Keyword Library 13 or the driver itself that will be
obtained subsequently through other ODBC calls and,
secondly, to make the same call to Database Connectivity
Driver 62 to allocate an SQL statement handle to contain
whatever information about the Application Database or the
driver itself that will be obtained subsequently through other
ODBC calls. The SQLFreeStmt call is used to get Query
Facility Connectivity Driver 64 to free the allocated SQL
statement data structures and to make the same call to
Database Connectivity Driver 62 to free the allocated SQL
statement handle. (If Microsoft Excel had been used without
Query Facility 65 and Query Facility Connectivity Driver
64, the first of these two calls would be used to cause
Database Connectivity Driver 62 to allocate an SQL statement
handle and the second would be used to cause Database
Connectivity Driver 62 to free the allocated SQL
statement handle.)

While this exemplary description of the Connectivity to
Existing Database Applications embodiment has been
described with reference to various embodiments described
previously which utilize Keyword Library 13 as the data
source of Query Facility Connectivity Driver 64 to generate
a table for the user to select items for the query, it is to be
appreciated that this embodiment is equally applicable for
use with other embodiments described above in which E-R
Model of Classes File 32 (e.g. FIG. 11) is used as the data
source, where each class in File 32 is treated as a separate
table to conveniently allow the user to select items for the
query.

A key benefit of this embodiment of the present invention
is that it allows the many millions of existing applications
used by many organizations, such as Microsoft Excel
spreadsheet, Microsoft Access DBMS, Lotus 123
spreadsheet, Powersoft Power Viewer or InfoMaker, Gupta
Quest, Q+E Database Editor, etc., that are ODBC compliant
or compliant to other data access interface standards, to be
reused for making powerful ad-hoc queries easily through
seamless integration of Query Facility 65, thus saving on the
cost of purchasing new tools to have more powerful ad-hoc
queries as well as the cost of training users to use the new
tools.



TABLE 3

Comparison of Implementation of ODBC Calls Between a Database Connectivity Driver and the Query Facility Connectivity Driver

ODBC API: RETCODE SQL_API SQLTables(HSTMT hstmt, UCHAR FAR *szTableQualifier, SWORD cbTableQualifier, UCHAR FAR *szTableOwner, SWORD cbTableOwner, UCHAR FAR *szTableName, SWORD cbTableName, UCHAR FAR *szTableType, SWORD cbTableType)
Database Connectivity Driver: The driver prepares to return information regarding the tables in the Application Database.
Query Facility Connectivity Driver: The driver prepares to return information regarding the table which is used to hold all the keywords in the Keyword Library 13 of the Query Facility 65.

ODBC API: RETCODE SQL_API SQLStatistics(HSTMT hstmt, UCHAR FAR *szTableQualifier, SWORD cbTableQualifier, UCHAR FAR *szTableOwner, SWORD cbTableOwner, UCHAR FAR *szTableName, SWORD cbTableName, UWORD fUnique, UWORD fAccuracy)
Database Connectivity Driver: Sets up the driver to return statistics about the tables in the Application Database.
Query Facility Connectivity Driver: Sets up the driver to return statistics about the table which is used to hold all the keywords in the Keyword Library 13 of the Query Facility 65.

ODBC API: RETCODE SQL_API SQLTablePrivileges(HSTMT hstmt, UCHAR FAR *szTableQualifier, SWORD cbTableQualifier, UCHAR FAR *szTableOwner, SWORD cbTableOwner, UCHAR FAR *szTableName, SWORD cbTableName)
Database Connectivity Driver: Sets up the driver to return the privileges of the Application Database's tables. Privileges may be "SELECT", "UPDATE", "DELETE", or "REFERENCES".
Query Facility Connectivity Driver: Sets up the driver to return the privileges of the Keyword Library's table which is used to hold the keywords. The only privilege allowed for the table is "SELECT" since Keyword Library 13 is used for read only.

ODBC API: RETCODE SQL_API SQLColumnPrivileges(HSTMT hstmt, UCHAR FAR *szTableQualifier, SWORD cbTableQualifier, UCHAR FAR *szTableOwner, SWORD cbTableOwner, UCHAR FAR *szTableName, SWORD cbTableName, UCHAR FAR *szColumnName, SWORD cbColumnName)
Database Connectivity Driver: Sets up the driver to return the privileges of the columns of a particular table in the Application Database. Privileges may be "SELECT", "UPDATE", "DELETE", or "REFERENCES".
Query Facility Connectivity Driver: Sets up the driver to return the privileges of the columns of the Keyword Library's table used to hold the keywords. The only privilege allowed for these columns is "SELECT" since the Keyword Library 13 is used for read only.

ODBC API: RETCODE SQL_API SQLSpecialColumns(HSTMT hstmt, UWORD fColType, UCHAR FAR *szTableQualifier, SWORD cbTableQualifier, UCHAR FAR *szTableOwner, SWORD cbTableOwner, UCHAR FAR *szTableName, SWORD cbTableName,
Database Connectivity Driver: Sets up the driver to return the required columns.
Query Facility Connectivity Driver: Return SQL_SUCCESS, but on subsequent SQLFetch/SQLGetData, return SQL_NO_DATA_FOUND. There is no need to identify columns that uniquely identify a row because the Query Facility 65 performs all table linking using its Inference Engine 17 without any intervention from the Application 60 or the user of the Application 60. There is no need to return information regarding
TABLE 3-continued

Comparison of Implementation of ODBC Calls Between a Database Connectivity Driver and the Query Facility Connectivity Driver

ODBC API (SQLSpecialColumns, continued): UWORD fScope, UWORD fNullable)
Query Facility Connectivity Driver (continued): columns that are automatically updated since the Query Facility 65 does not support updates to the Application Database or to the Keyword Library.

ODBC API: RETCODE SQL_API SQLPrimaryKeys(HSTMT hstmt, UCHAR FAR *szTableQualifier, SWORD cbTableQualifier, UCHAR FAR *szTableOwner, SWORD cbTableOwner, UCHAR FAR *szTableName, SWORD cbTableName)
Database Connectivity Driver: Sets up the driver to return the required primary keys of the Application Database.
Query Facility Connectivity Driver: Return SQL_SUCCESS, and then return SQL_NO_DATA_FOUND at SQLFetch/SQLGetData.

ODBC API: RETCODE SQL_API SQLForeignKeys(HSTMT hstmt, UCHAR FAR *szPkTableQualifier, SWORD cbPkTableQualifier, UCHAR FAR *szPkTableOwner, SWORD cbPkTableOwner, UCHAR FAR *szPkTableName, SWORD cbPkTableName, UCHAR FAR *szFkTableQualifier, SWORD cbFkTableQualifier, UCHAR FAR *szFkTableOwner, SWORD cbFkTableOwner, UCHAR FAR *szFkTableName, SWORD cbFkTableName)
Database Connectivity Driver: Sets up the driver to return the required foreign keys of the Application Database.
Query Facility Connectivity Driver: Return SQL_SUCCESS, and then return SQL_NO_DATA_FOUND at SQLFetch/SQLGetData.

ODBC API: RETCODE SQL_API SQLAllocEnv(HENV FAR *phenv)
Database Connectivity Driver: Allocate storage for information about the ODBC environment.
Query Facility Connectivity Driver: Allocate storage for information about the ODBC environment as well as special information required to run the Query Facility 65.

ODBC API: RETCODE SQL_API SQLAllocConnect(HENV henv, HDBC FAR *phdbc)
Database Connectivity Driver: Allocate storage for information about a connection to the Application Database as a data source.
Query Facility Connectivity Driver: Allocate storage for information about a connection to the Keyword Library 13 of Query Facility 65 as a data source. Also make same call to Database Connectivity Driver to allocate storage for information about a connection to the Application Database as a data source.

ODBC API: RETCODE SQL_API SQLConnect(HDBC hdbcParam, UCHAR FAR *szDSN, SWORD cbDSN, UCHAR FAR *szUID, SWORD cbUID, UCHAR FAR *szAuthStr, SWORD cbAuthStr)
Database Connectivity Driver: Connects to the Application Database.
Query Facility Connectivity Driver: Connects to the Keyword Library 13. Prompts the user for Keyword Library 13 password. Makes SQLAllocEnv and SQLAllocConnect calls to the Database Connectivity Driver. Finally, connects to the Application Database by making the same SQLConnect call to the Database Connectivity Driver.

ODBC API: RETCODE SQL_API SQLDriverConnect(HDBC hdbc, HWND hwnd, UCHAR FAR *szConnStrIn, SWORD cbConnStrIn, UCHAR FAR *szConnStrOut, SWORD cbConnStrOutMax, SWORD FAR *pcbConnStrOut, UWORD fDriverCompletion)
Database Connectivity Driver: Connects to the Application Database, or alternatively, prompts the user for more information.
Query Facility Connectivity Driver: Connects to the Keyword Library 13 of Query Facility 65. Prompts the user for Keyword Library 13 password. Finally, connects to the Application Database by making the same call to the Database Connectivity Driver.

ODBC API: RETCODE SQL_API SQLBrowseConnect(HDBC hdbc, UCHAR FAR *szConnStrIn, SWORD cbConnStrIn, UCHAR FAR *szConnStrOut, SWORD cbConnStrOutMax, SWORD FAR *pcbConnStrOut)
Database Connectivity Driver: Iteratively prompts for Application Database connection string.
Query Facility Connectivity Driver: Iteratively prompts for Keyword Library 13 connection string.

ODBC API: RETCODE SQL_API SQLDisconnect(HDBC hdbcParam)
Database Connectivity Driver: Disconnects from the Application Database.
Query Facility Connectivity Driver: Disconnects from the Keyword Library 13 and makes same call to Database Connectivity Driver to disconnect from database.

ODBC API: RETCODE SQL_API SQLFreeConnect(HDBC hdbcParam)
Database Connectivity Driver: Frees connection handle of Application Database.
Query Facility Connectivity Driver: Frees Keyword Library 13 connection handle, and makes same call to Database Connectivity Driver to free database connection handle.

ODBC API: RETCODE SQL_API SQLFreeEnv(HENV henv)
Database Connectivity Driver: Frees the driver's environment handle.
Query Facility Connectivity Driver: Frees the driver's environment handle, and then makes same call to Database Connectivity Driver to free its environment handle.

ODBC API: RETCODE SQL_API SQLExecute(HSTMT hstmt)
Database Connectivity Driver: Executes the SQL statement associated with hstmt.
Query Facility Connectivity Driver: Executes the SQL statement associated with hstmt.

ODBC API: RETCODE SQL_API SQLExecDirect(HSTMT hstmt, UCHAR FAR *szSqlStr, SDWORD cbSqlStr)
Database Connectivity Driver: Executes the SQL statement passed in the parameter szSqlStr.
Query Facility Connectivity Driver: Parses the SQL statement to extract the items or columns selected by the user and then calls the Inference Engine 17 of the Query Facility 65 to determine the access path through the Application Database using the selected items. After the Program Generator 18 has generated an
TABLE 3-continued

Comparison of Implementation of ODBC Calls Between a Database Connectivity Driver and the Query Facility Connectivity Driver

Query Facility Connectivity Driver (SQLExecDirect, continued): SQL source program for the access path, the driver then makes a SQLExecDirect call to the Database Connectivity Driver to execute the SQL source program and to return the query results to the user.

ODBC API: RETCODE SQL_API SQLNativeSql(HDBC hdbcParam, UCHAR FAR *szSqlStrIn, SDWORD cbSqlStrIn, UCHAR FAR *szSqlStr, SDWORD cbSqlStrMax, SDWORD FAR *pcbSqlStr)
Database Connectivity Driver: Translates the SQL statement to the SQL dialect of the DBMS 63 used by the Application Database.
Query Facility Connectivity Driver: Parses the SQL statement it receives to extract the items or columns selected by the user and then calls the Inference Engine 17 of the Query Facility 65 to determine the access path through the Application Database using the selected items. After the Program Generator 18 has generated the SQL source program for the access path, the driver translates this SQL source program to the SQL dialect of the DBMS 63 used by the Application Database.

ODBC API: RETCODE SQL_API SQLParamData(HSTMT hstmt, PTR FAR *prgbValue)
Database Connectivity Driver: Functions to support parameters.
Query Facility Connectivity Driver: Pass call directly to Database Connectivity Driver.

ODBC API: RETCODE SQL_API SQLPutData(HSTMT hstmt, PTR rgbValue, SDWORD cbValue)
Database Connectivity Driver: Functions to support parameters.
Query Facility Connectivity Driver: Pass call directly to Database Connectivity Driver.

ODBC API: RETCODE SQL_API SQLSetParam(HSTMT hstmt, UWORD ipar, SWORD fCType, SWORD fSqlType, UDWORD cbColDef, SWORD ibScale, PTR rgbValue, SDWORD FAR *pcbValue)
Database Connectivity Driver: Functions to support parameters.
Query Facility Connectivity Driver: Pass call directly to Database Connectivity Driver.

ODBC API: RETCODE SQL_API SQLDescribeParam(HSTMT hstmt, UWORD ipar, SWORD FAR *pfSqlType, UDWORD FAR *pcbColDef, SWORD FAR *pibScale, SWORD FAR *pfNullable)
Database Connectivity Driver: Functions to support parameters.
Query Facility Connectivity Driver: Pass call directly to Database Connectivity Driver.

ODBC API: RETCODE SQL_API SQLParamOptions(HSTMT hstmt, UDWORD crow, UDWORD FAR *pirow)
Database Connectivity Driver: Functions to support parameters.
Query Facility Connectivity Driver: Pass call directly to Database Connectivity Driver.

ODBC API: RETCODE SQL_API SQLNumParams(HSTMT hstmt, SWORD FAR *pcpar)
Database Connectivity Driver: Functions to support parameters.
Query Facility Connectivity Driver: Pass call directly to Database Connectivity Driver.

ODBC API: RETCODE SQL_API SQLGetInfo(HDBC hdbcParam, UWORD fInfoType, PTR rgbInfoValue, SWORD cbInfoValueMax, SWORD FAR *pcbInfoValue)
Database Connectivity Driver: Return required information, either about itself or about the Application Database.
Query Facility Connectivity Driver: Return required information, either about itself, about the Database Connectivity Driver, or about the Application Database.

ODBC API: RETCODE SQL_API SQLGetTypeInfo(HSTMT hstmt, SWORD fSqlType)

ODBC API: RETCODE SQL_API SQLGetFunctions(HDBC hdbcParam, UWORD fFunction, UWORD FAR *pfExists)
Database Connectivity Driver: Retrieve information regarding functions supported in the driver.
Query Facility Connectivity Driver: Pass call to Database Connectivity Driver. But return result is modified to reflect functions that may not apply to the Query Facility, e.g. Query Facility does not support create or update functions.

ODBC API: RETCODE SQL_API SQLSetConnectOption(HDBC hdbcParam, UWORD fOption, UDWORD vParam)
Database Connectivity Driver: Set connection options.
Query Facility Connectivity Driver: Pass call to Database Connectivity Driver.

ODBC API: RETCODE SQL_API SQLSetStmtOption(HSTMT hstmt, UWORD fOption, UDWORD vParam)
Database Connectivity Driver: Set statement options.
Query Facility Connectivity Driver: Pass call to Database Connectivity Driver.

ODBC API: RETCODE SQL_API SQLGetConnectOption(HDBC hdbcParam, UWORD fOption, PTR pvParam)
Database Connectivity Driver: Get connection options.
Query Facility Connectivity Driver: Pass call to Database Connectivity Driver.

ODBC API: RETCODE SQL_API SQLGetStmtOption(HSTMT hstmt, UWORD fOption, PTR pvParam)
Database Connectivity Driver: Get statement options.
Query Facility Connectivity Driver: Pass call to Database Connectivity Driver.

ODBC API: RETCODE SQL_API SQLAllocStmt(
Database Connectivity Driver: Allocate SQL statement handle.
Query Facility Connectivity Driver: Allocate SQL statement data structures and call



Functions to support parameters. 



Retrieve information regarding datatype supported Pass call to Database Connectivity Driver, 
by Application Database and driver 
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TABLE 3 -continued 



Comparison of Implementation of ODBC Calls Between a Database Conoectivitv Driver and the Query Facility Connectivity Driver 



ODBC API 



Database Connectivity Driver 



Query Facility Connectivity Driver 



HDBC hdbcParam, 
HSTMT FAR ♦phstmt) 
RETCODESQL_APISQD : reeStmt( 
HSTMT hstmt, 
UWORD fDption) 
RBTCODESQL_APlSQLPrepwe( 
HSTMT hstint, 
UCHAR FAR •szSqlStr, 
SDWORD cbSqlStr) 



RETCODESQL_APISQIjCancel< 
HSTMT hstmt) 

RBTCODESQL_APISQLSetScroUOptiaiis( 

HSTMT hstmt, 

UWORD fCoraurrerxy, 

SDWORD crowKeyset, 

UWORD crowRowset) 

RETCX)DESQU - APlSQLSetCursorNarae< 

HSTMT hstmt, 

UCHAR FAR •szCursor, 

SWORD cbCursor) 

RETCODESQL_APISQLGetCur80TName( 
HSTMT hstmt, 
UCHAR FAR ♦szCursor, 
SWORD cbCuiiOfMkj., 
SWORD FAR *pcbCu»or) 
RETCODESQU_APISQLNuniRcsultCob( 
HSTMT hstmt, 
SWORD DAR *pccol) 

RETCODESQU_APISQLDescTibeCo]( 

HSTMT hstmt, 

UWORD icol, 

UCHAR FAR *szCoIName, 

SWORD cbCottfameMax, 

SWORD FAR ♦pcbCoIName, 

SWORD FAR •(pfSqnype, 

UDWORD FAR ♦pcbCofDef, 

SWORD FAR ♦pibScak, 

SWORD FAR *pfKullable) 

RETCODESQL_JVPISQlArtributes( 

HSTMT hstmt, 

UWORD icol, 

UWORD fDescType, 

PTR rgbDeac, 

SWORD cbDescMax, 

SWORD FAR •pcbDesc, 

SDWORD FAR *pflDeac) 



RETCODESQL_APISQLBindCol( 

HSTMT hstmt, 

UWORD icoL, 

SWORD fCTypc, 

PTR rgb Value, 

SDWORD cbVahicMax, 

SDWORD FAR ♦peb Value) 

RETCODESQL_APlSQLHetch< 

HSTMT hstmt 



RETCODESQL_APESQljGetDkta( 

HSTMT hstmt, 

UWORD icoL 

SWORD fCiypc, 

PTR rgbVklue, 

SDWORD cbVahieMax, 

SDWORD FAR *pcbValue) 



Free SQL statement handle 

Prepare an SQL statement for execution 



Cancels execution of a SQL statement or 

command. 

Obselete 



SQLAHocScmt in the Database Connectivity 
Driver. 

Free SQL statement data structures and make same 
call to Database Connectivity Driver to free its 
SQL statement handle 

Parses the SQL statement to extract the hems or 
columns selected by the user and then calls the 
Inference Engine 17 to determine the access path 
through the Application Database using the 
selected items. 

After the Program Generator 18 has generated an 
SQL source program for the access path driver 
then makes a SQLPrepare call (o the Database 
Connectivity Driver to prepare the SQL source 
program. 

Pass call to the Database Connectivity Driver. 
Obselete 



Associate a cursor name with a hstmt 



Associate a cursor name with a hstxnt If the hstmt 
is also associated with the Database Connectivity 
Driver^eg, in the case of SQLExecute), then call 
the equivalent function in the Database 
Connectivity Driver. 

Return the cursor name for a statement handle Return the cursor name for a statement handle If 

the hstmt is also associated with the Database 
Connectivity Driver(eg t in the case of 
SQLExecute), then call the equivalent' function in 
the Database Connectivity Driver. 
Returns the number of cohmms m the result set. Returns the number of columns in the result set. If 

the hstmt is associated with the Database 
Connectivity Driver, then make the same call to 
the Database Connectivity Driver. 
Return information about the column of a table in Return inronnabcm about the selected keyword 
the Application Database the user wants from die Keyword Library 13 the user wants 

mfonnatkn about. information about. This keyword is first mapped 

to an actual database column of the Application 
Database by the driver which then issues the same 
call to the database Connectivity Driver 62 to get 
infonnation about this database column. After it 
gets this information, the Oliver returns this 
mfonnauon. 

Returns descriptor information about the database Returns descriptor information about the selected 
column of a table in the Application Database the keyword from the Keyword Library 13 the user 



user wants information about 



Associate a result buffer with a column in the 
result set. 



Returns data for bound columns in the current 
row, and advances the cursor. 



Returns result data for a single column in t 
current row. 



wants information about This keyword is first 
mapped to an actual darabaw column of the 
Application Database by the driver which then 
issues the same call to the Database Connectivity 
Driver 62 to get the descriptor ©formation about 
this database column. 

After it gets this descriptor information, the driver 

returns mis xnformattoo. 

Associate a result with a column in the 

result set 

If the hstmt is awncaitrd with the data b ase driver, 
then make the same function call to the Database 
Connectivity Driver. 



Returns data for bound whim™* in the current 
row, and advance the cursor. 
If the hstmt is associated with the Database 
Connectivity Driver, then call the equivalent 
function in the Database Connectivity Driver. 
Returns result data for a single column in the 
current row. 

If the hstmt is associated with the Database 
Connectivity Driver, then call the equivalent 
function in the Database Connectivity Driver. 
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TABLE 3 -continued 



Comparison of Implementation of ODBC Calls Between a Database Connectivity Driver and the Quay Facility Connectivity Driver 



ODBC API 



Database Connectivity Driver 



Query Facility Connectivity Driver 



RETCODESQU_APlSQLMoreResults( 
HSTMT hstml) 



RETCODESQL_ J AFE5QLRowCount( 
HSTMT hstmt, 
SDWORD FAR ♦pcrow) 



RETCODESQL_APISQLSetPo6( 
HSTMT hstmt, 
UWORD irow, 
UWORD fOption, 
UWORD flx>ck) 

RETCODESQL_APISQLExterxlodFetch( 

HSTMT hstmt, 

UWORD ffetchType, 

SDWORD irow, 

UDWORD FAR »pctow, 

UWORD FAR *rgfRowStarus) 

RBTCODESQL_APISQLEnor( 

HENV henv, 

HDBC hdbcPiram, 

HSTMT hstml, 

UCHAK FAR •B2SqlState, 

SDWORD FAR •pfNativeEnor, 

UCHAR FAR •szErrorMsg, 

SWORD cbErrorMsgMax, 

SWORD FAR •pcbErrorMsg) 

KEfCODESQL-AKiSQLi ransact( 

HENV benv, 

HDBC hdbcParam, 

UWORD fType) 



Checks if more information is available for hstmL Checks if more information is available for bstmt. 

If the hstmt is associated with the Database 
Connectivity Driver, then call the equivalent 
function in the Database Connectivity Driver 1. 
Returns the number ofrows attached to "hstmt" 
Hstmt may be associated with the Application 
Database or the Keyword Library 13. 
If the hstmt is associsted with the Databse 
Connectivity Driver, then call the equivalent 
function u the Database Connectivity Driver. 
Sets position of cursor. 
If the hstmt is associated with the Database 
Connectivity Driver, then call the equivalent 
function in the Database Connectivity Driver. 

Not implemented ODBCDLL provides a default 
implementation. 



Returns the number of rows associated with the 
Application Database attached to "hstmt" . 



Sets position of cursor. 



Fetches data. 



Returns the most recent error. 



Performs commit or rollback 



Retms the most recent error. 

If the heov, hdbcParam, or hstmt is associated with 

the Database ConnectivUy Driver, then call the 

equivalent function in the Database Connectivity 

Driver. 



Dvvb jjuiiiiftg SUiC© COumiitf ml*" 
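
The recurring rule in Table 3, "if the hstmt is associated with the Database Connectivity Driver, then call the equivalent function in the Database Connectivity Driver", can be illustrated with a minimal sketch. It is written in Python rather than against the real ODBC C API, and every class, method and attribute name in it is hypothetical; it only assumes that the Query Facility Connectivity Driver keeps a record of which statement handles belong to the underlying driver and which hold Keyword Library rows.

```python
class DatabaseConnectivityDriver:
    """Hypothetical stand-in for the underlying driver; it records its calls."""

    def __init__(self):
        self.calls = []

    def fetch(self, hstmt):
        self.calls.append(("fetch", hstmt))
        return "row from Application Database"


class QueryFacilityConnectivityDriver:
    """Hypothetical wrapper that forwards a call only for associated handles."""

    def __init__(self, db_driver):
        self.db_driver = db_driver
        self.db_hstmt = {}      # own hstmt -> hstmt of the underlying driver
        self.keyword_rows = {}  # own hstmt -> pending Keyword Library rows

    def fetch(self, hstmt):
        # Handle associated with the underlying driver: pass the call through.
        if hstmt in self.db_hstmt:
            return self.db_driver.fetch(self.db_hstmt[hstmt])
        # Otherwise serve the next Keyword Library row, or None when exhausted.
        rows = self.keyword_rows.get(hstmt, [])
        return rows.pop(0) if rows else None
```

A handle that was never registered in `db_hstmt` never reaches the underlying driver, which mirrors the conditional wording used in the Query Facility Connectivity Driver column.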
What is claimed is:

1. An end user query facility for accessing a database 
having a plurality of database files formed using a database 
model, comprising: 

a knowledge base which stores a set of linkages of the 
database model, each said linkage representing a rela- 
tion between two of said database files in which a first 
file has a key that references an equivalent key of a 
second file; 

a semantics extractor for reading said database model and 
extracting the semantics of said database model, and 
which stores in said knowledge base said set of link- 
ages; 

an application, which supports end-user query, to obtain
from a user a designation of the information to be 
extracted from said database; 

a query facility connectivity driver for connecting said
application to said query facility, and for connecting
said query facility to said database through a database
connectivity driver;

an inference engine which, based upon said designation of
information to be extracted from said database, identifies
one or more of said database files which contain
the desired information and searches said knowledge
base to determine the linkage(s) connecting said one or
more identified files; and

a program generator which accesses the linkages obtained 
by said inference engine and generates a program to 
extract said desired information from said database. 
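
As a rough illustration of the search recited for the inference engine in claim 1, the sketch below treats each linkage as a pair (first file, second file) and looks for a shortest chain of linkages joining two identified files. The breadth-first strategy, the decision to follow a linkage in either direction, and the function and file names are assumptions made for this example; the claim itself does not prescribe a search method.

```python
from collections import deque


def find_linkages(linkages, start, goal):
    """Return the linkages on a shortest chain joining start to goal, or None."""
    # Index every linkage under both of its files (assumed traversable both ways).
    neighbours = {}
    for first, second in linkages:
        neighbours.setdefault(first, []).append((second, (first, second)))
        neighbours.setdefault(second, []).append((first, (first, second)))

    queue = deque([(start, [])])
    seen = {start}
    while queue:
        current, path = queue.popleft()
        if current == goal:
            return path
        for nxt, link in neighbours.get(current, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [link]))
    return None  # the two files are not connected by any chain of linkages
```

The returned list is the kind of input a program generator could then turn into joins, one per linkage.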

2. An end-user query facility as in claim 1 wherein said 
inference engine comprises: 



means for inferring new acquired knowledge threads and 
storing said new acquired knowledge threads in said 
knowledge base; 

means for determining an access path to said database
using either said basic knowledge threads or said 
acquired knowledge threads as required to meet the 
user query. 

3. An end-user query facility as in claim 2 wherein said 
acquired knowledge thread is derived through a combination
in parallel of two or more of said basic knowledge threads 
such that one of the said basic knowledge threads has one or 
more of its consecutive files in common with the corre- 
sponding number of consecutive files starting from the 
thread head of another said basic knowledge threads. 

4. An end-user query facility as in claim 3 wherein said 
thread head is a first file on a said basic knowledge thread. 
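
The overlap condition in claims 3 and 4 can be made concrete with a small sketch. It models a knowledge thread as an ordered list of file names and checks only whether one thread contains a run of consecutive files equal to the first files of another thread, counted from that other thread's head. The list representation and the function name are assumptions for illustration, and the sketch does not attempt to build the combined (acquired) thread.

```python
def head_overlap(thread, other):
    """Find the longest run of consecutive files in `thread` that equals the
    first files of `other`, starting at `other`'s thread head.

    Returns (start index in thread, run length), or None when the head of
    `other` does not occur in `thread` at all.
    """
    best = None
    for start in range(len(thread)):
        length = 0
        # Extend the run while both threads keep naming the same file.
        while (start + length < len(thread) and length < len(other)
               and thread[start + length] == other[length]):
            length += 1
        if length and (best is None or length > best[1]):
            best = (start, length)
    return best
```

For example, a thread A, B, C, D and a thread B, C, E overlap on the two files B and C starting at index 1 of the first thread, so the two threads would be candidates for combination under claim 3.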

5. An end-user query facility as in claim 1 wherein said 
semantics extractor further comprises means for reading 
source code of application programs that access said 
database, extracting the semantics of said application pro- 
grams and storing in said knowledge base said set of 
linkages. 

6. An end user query facility for accessing a database 
having a plurality of database files formed using a database 
model comprising: 

a knowledge base which stores a set of classes and a set 
of linkages of the database model, each said class 
represents a hierarchical grouping of a subset of said
database files, each said linkage representing a relation 
between two of said database files in which a first file 
has a key that references an equivalent key of a second 
file;

a class generator for reading said database model and
generating said set of classes and said set of linkages of 
the database model and which stores in said knowledge 
base said set of classes and said set of linkages; 

an application, which supports end-user query, to obtain 
from a user choices based on said classes as a designation
of the information to be extracted from said
database; 

a query facility connectivity driver for connecting said 
application to said query facility, and for connecting 
said query facility to said database through a database 
connectivity driver; 

an inference engine which, based upon said designation of 
information to be extracted from said database, iden- 
tifies one or more of said database files which contain 
the desired information and searches said knowledge 
base to determine the linkage(s) connecting said one or
more identified files; and 

a program generator which accesses the linkages obtained 
by said inference engine and generates a program to 
extract said desired information from said database. 

7. An end user query facility for accessing a database 
having a plurality of database files formed using a database 
model, comprising: 

a knowledge base which stores a set of classes and a set 
of linkages of the database model, each said class
represents a hierarchical grouping of a subset of said 
database files, each said linkage representing a relation 
between two of said database files in which a first file 
has a key that references an equivalent key of a second 
file; 

a class generator for reading said database model and 
generating said set of classes and said set of linkages of 
the database model and which stores in said knowledge 
base said set of classes and said set of linkages; 

an application, which supports end-user query, to obtain 
from a user a designation of the information to be 
extracted from said database; 

a query facility connectivity driver for connecting said 
application to said query facility, and for connecting 
said query facility to said database through a database 
connectivity driver; 

an inference engine which, based upon said designation of 
information to be extracted from said database, iden- 
tifies one or more of said database files which contain 
the desired information and searches said knowledge 
base to determine the linkage(s) connecting said one or 
more identified files; 

a knowledge thread analyzer receiving as its input said 
linkage(s) determined by said inference engine, and
which breaks down said linkage(s) into simple linkage(s); and

a program generator which accesses said simple linkages 
obtained by said knowledge analyzer and generates 
programs, one for each said simple linkage, to extract 
said desired information from said database as a plu- 
rality of simple query results, one for each generated 
program. 

8. An end-user query facility as in claim 7 wherein a said 
simple query result is one that does not contain repeated 
numeric values in any of the column(s) that corresponds to 
a non-key numeric item of said database. 

9. An end-user query facility as in claim 7 wherein a said 
simple query result is one that does not contain two or more 
columns belonging to items whose database files are multi- 
valued dependencies of another database file in said data- 
base.

10. An end user query facility for accessing a database
having a plurality of database files formed using a database 
model, comprising: 

a knowledge base which stores a set of linkages of the 
database model, each said linkage representing a relation
between two of said database files in which a first
file has a key that references an equivalent key of a 
second file; 

a semantics extractor for reading said database model and 
extracting the semantics of said database model, and 
which stores in said knowledge base said set of link- 
ages; 

a keyword library which stores a set of keywords of said
database model;

an application, which supports end-user query, to obtain
from a user a designation of the information to be
extracted from said database using said keyword
library;

a first connectivity driver for connecting said application 
to said keyword library; 

an inference engine which, based upon said designation of 
information to be extracted from said database, iden- 
tifies one or more of said database files which contain 
the desired information and searches said knowledge
base to determine the linkage(s) connecting said one or
more identified files; 

a first e-mail agent for said first connectivity driver to 
interface with an e-mail system, said first e-mail agent 
being used to post said designation of information to be 
extracted from said database to the mailbox of said 
inference engine; 

a second e-mail agent for the said inference engine to 
interface with said e-mail system, the said second 
e-mail agent being used to access said mailbox of said 
inference engine to obtain said designation of informa- 
tion to be extracted from said database; 

a program generator which accesses the linkages obtained 
by said inference engine and generates a program to 
extract said desired information from said database. 

11. An end-user query facility as in claims 1, 6 or 7
wherein both said application and said query facility con- 
nectivity driver have the same data access interface that 
conforms to a data access interface standard. 

12. An end user query facility as in claims 1, 6 or 7
wherein both said query facility connectivity driver and said 
database connectivity driver have the same data access 
interface that conforms to a data access interface standard. 

13. An end user query facility as in claims 11 or 12 
wherein said data access interface standard is Microsoft 
Corporation's Open Database Connectivity (ODBC) stan- 
dard. 

14. An end user query facility as in claim 11 or 12 wherein 
said data access interface standard is Apple Computer's Data 
Access Language (DAL) standard. 

15. An end user query facility as in claims 1 or 10 wherein 
said semantics extractor comprises means for deriving basic 
knowledge threads comprising a set of linkages of said 
database model and storing said knowledge threads in said 
knowledge base. 

16. An end-user query facility as in claim 15 wherein each 
said basic knowledge thread comprises a set of two or more 
of said database files serially linked together such that one 
file is serially linked to the next file through an item that has 
the same domain as the item in the next file and that this 
same item is a unique or repeating key of the next file. 

17. An end-user query facility as in claim 15 wherein each 
said basic knowledge thread comprises a set of two or more
of said database files serially linked together such that one
file has a repeating or foreign key that references the unique
or primary key of the next file.

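A basic knowledge thread of the kind recited in claim 17 can be pictured as a chain of files in which each file's foreign key references the primary key of the next. The sketch below enumerates such chains from a mapping of foreign-key references. The mapping format, the choice to start chains only at files that no other file references, and the function name are assumptions made for this example rather than the method of the semantics extractor.

```python
def basic_threads(references):
    """Enumerate maximal chains of two or more files.

    `references` maps a file to the files whose unique or primary key it
    references through a repeating or foreign key.
    """
    referenced = {t for targets in references.values() for t in targets}
    # Assumed thread heads: files that are not referenced by any other file.
    heads = [f for f in references if f not in referenced]
    threads = []

    def extend(chain):
        # Skip files already on the chain so that cyclic references terminate.
        nexts = [t for t in references.get(chain[-1], []) if t not in chain]
        if not nexts:
            if len(chain) >= 2:
                threads.append(chain)
            return
        for target in nexts:
            extend(chain + [target])

    for head in heads:
        extend([head])
    return threads
```

With an order-line file that references both an order file and a product file, and an order file that references a customer file, the sketch yields two threads, one per branch. A model in which every file is referenced by some other file has no assumed head and therefore yields no threads under this simplification.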
18. An end user query facility for accessing a database
having a plurality of database files formed using a database
model, comprising:

a knowledge base which stores a set of classes and a set 
of linkages of the database model, each said class 
represents a hierarchical grouping of a subset of said 
database files, each said linkage representing a relation
between two of said database files in which a first file 
has a key that references an equivalent key of a second 
file; 

a class generator for reading said database model and 
generating said set of classes and said set of linkages of
the database model and which stores in said knowledge 
base said set of classes and said set of linkages; 

a keyword library which stores a set of keywords of said 
set of classes including the names of said classes and 
their attributes;

an application, which supports end-user query, to obtain 
from a user a designation of the information to be
extracted from said database using said keyword 
library; 

a first connectivity driver for connecting said application 
to said keyword library; 

an inference engine which, based upon said designation of 
information to be extracted from said database, identifies
one or more of said database files which contain
the desired information and searches said knowledge 
base to determine the linkage(s) connecting said one or 
more identified files; 

a first e-mail agent for said first connectivity driver to 
interface with an e-mail system, said first e-mail agent
being used to post said designation of information to be 
extracted from said database to the mailbox of said 
inference engine; 

a second e-mail agent for the said inference engine to 
interface with said e-mail system, the said second
e-mail agent being used to access said mailbox of said 
inference engine to obtain said designation of informa- 
tion to be extracted from said database; and 

a program generator which accesses the linkages obtained 
by said inference engine and generates a program to
extract said desired information from said database. 

19. An end user query facility for accessing a database 
having a plurality of database files formed using a database 
model, comprising: 

a knowledge base which stores a set of classes and a set
of linkages of the database model, each said class 
represents a hierarchical grouping of a subset of said 
database files, each said linkage representing a relation 
between two of said database files in which a first file 
has a key that references an equivalent key of a second
file; 

a class generator for reading said database model and
generating said set of classes and said set of linkages of 
the database model and which stores in said knowledge 
base said set of classes and said set of linkages;

a keyword library which stores a set of keywords of said 
set of classes including the names of said classes and 
their attributes; 

an application, which supports end-user query, to obtain 
from a user a designation of the information to be
extracted from said database using said keyword 
library; 



a first connectivity driver for connecting said application 
to said keyword library; 

an inference engine which, based upon said designation of 
information to be extracted from said database, identifies
one or more of said database files which contain
the desired information and searches said knowledge
base to determine the linkage(s) connecting said one or
more identified files;

a first e-mail agent for said first connectivity driver to 
interface with an e-mail system, said first e-mail agent 
being used to post said designation of information to be 
extracted from said database to the mailbox of said 
inference engine; 

a second e-mail agent for the said inference engine to 
interface with said e-mail system, the said second 
e-mail agent being used to access said mailbox of said 
inference engine to obtain said designation of informa- 
tion to be extracted from said database; 

a knowledge thread analyzer receiving as its input said 
linkage(s) determined by said inference engine, and 
which breaks down said linkage(s) into simple linkage(s); and

a program generator which accesses said simple linkages 
obtained by said knowledge analyzer and generates 
programs, one for each said simple linkage, to extract 
said desired information from said database as a plu- 
rality of simple query results, one for each generated 
program. 

20. An end user query facility as in claims 1, 6, 7, 10, 18 
or 19 wherein said application comprises:

means for obtaining from a user a user supplied key word 
indicative of the data desired to be extracted from said 
database; 

means for determining all keywords in said database 
model having a predefined relationship with said user 
supplied keyword; and 

means for causing a user to select one or more of said 
keywords having a predefined relationship, said one or 
more keywords thus selected serving as said designa- 
tion of the information to be extracted from said 
database. 

21. An end user query facility as in claims 6, 7, 18 or 19 
wherein each said class comprises a data model of a subset 
of database files of said database model, the said data model 
comprising a tree. 

22. An end user query facility as in claim 21 wherein said 
tree of said class has a root comprising a database file of 
kernel entity type and a plurality of branches each comprising
a linkage of one or more database files from said subset
of database files. 

23. An end-user query facility as in claim 22 wherein said 
class generator comprises: 

a semantics extractor to derive a set of binary relation- 
ships and to derive the entity type of each database file 
of said database model; 

a means to derive said set of classes using said set of 
binary relationships and said entity type of each data- 
base file of said database model. 

24. An end-user query facility as in claim 23 wherein said 
entity type is selected from a set of entity types which 
comprise a kernel entity type, a subtype entity type, a 
characteristic entity type, an associative entity type or a pure 
lookup entity type.

25. An end-user query facility as in claim 23 wherein said
binary relationship comprises a linkage between two files in
said database model in which one file has a repeating or
foreign key that references the unique or primary key of the
other file.

26. An end user query facility as in claim 25 wherein said
set of binary relationships comprises "has_children",
"has_wards", "inverse_of_pure_lookup" or "has_subtype"
binary relationships.

27. An end user query facility as in claim 26 wherein said
set of binary relationships further comprises
"inverse_of_has_children", "inverse_of_has_wards",
"pure_lookup" or "inverse_of_has_subtype" binary
relationships.

28. An end user query facility as in claim 27 wherein said
"inverse_of_has_children" binary relationship is one in
which the target file is not a pure_lookup entity type and has
a one-to-many relationship with the source file, and the
source file does not have its own independent unique or
primary key.

29. An end user query facility as in claim 27 wherein said
"inverse_of_has_wards" binary relationship is one in which
the target file is not a pure lookup entity type and has a
one-to-many relationship with the source file, and the source
file has its own independent unique or primary key.

30. An end user query facility as in claim 27 wherein said
"inverse_of_has_subtype" binary relationship is one in
which its target file has a one-to-one relationship with its
source file and the source file is a subtype entity of the target
file.

31. An end-user query facility as in claim 27 wherein said
"pure_lookup" binary relationship is a binary relationship
whose target file is a pure lookup entity.

32. An end user query facility as in claim 26 wherein said
"has_children" binary relationship is one in which the
source file is not a pure lookup entity type and has a
one-to-many relationship with the target file, and the target
file does not have its own independent unique or primary
key.

33. An end user query facility as in claim 26 wherein said
"has_wards" binary relationship is one in which its source
file is not a pure lookup entity type and has a one-to-many
relationship with its target file, and its target file has its own
independent unique key or primary key.

34. An end user query facility as in claim 26 wherein said
"has_subtype" binary relationship is one in which its source
file has a one-to-one relationship with its target file and the
target file is a subtype entity of the source file.

35. An end-user query facility as in claim 26 wherein said
"inverse_of_pure_lookup" binary relationship is a binary
relationship whose source file is a pure_lookup entity.

36. An end-user query facility as in claim 23 wherein said
semantics extractor further comprises means for deriving
basic knowledge threads comprising a set of linkages of said
database model and storing said basic knowledge threads in
said knowledge base.

37. An end-user query facility extractor as in claim 36
wherein each said basic knowledge thread comprises a set of
two or more of said database files serially linked together
such that one file is linked to the next file through an item
that has the same domain as the item in the next file and that
this same item is a unique or repeating key of the next file.

38. An end-user query facility extractor as in claim 37
wherein each said basic knowledge thread comprises a set of
two or more of said database files serially linked together
such that one file has a repeating or foreign key that
references the unique or primary key of the next file.

39. An end-user query facility as in claim 22 wherein
said class further comprises a set of relationship names each
of which specifies the nature of the relationship between two
adjacent database files in each branch of said tree of said
class.

40. An end-user query facility as in claim 39 wherein said
class generator comprises:

a semantics extractor to derive a set of binary relationships
and to identify the entity type of each database file
of said database model;

a means to derive said set of classes and their said set of
relationship names using said set of binary relationships
and said entity type of each database file of said
database model.

41. An end-user query facility as in claim 40 wherein said
class generator further comprises a means for a user to
modify said relationship names in the said set of classes.

42. An end-user query facility as in claim 6, 7, 18 or 19
wherein said application comprises an interface that displays
all or selected said classes and that allows a user to formulate
a query by picking the desired class attributes from said class
as a designation of the information to be extracted from said
database.

43. An end-user query facility as in claims 1, 6, 7, 10, 18
or 19 which further comprises a model purifier for a user to
alter said database model by defining new keys for database
files or by altering existing keys of database files of said
database model.

44. An end-user query facility as in claim 43 wherein said
keys comprise unique or repeating keys.

45. An end-user query facility as in claim 43 wherein said
keys comprise primary or foreign keys.

46. An end-user query facility as in claims 1, 6, 7, 10, 18
or 19 which further comprises a model purifier for a user to
alter said database model by defining new item(s), new
key(s) or new file(s) using items of database files of said
database model.

47. An end-user query facility as in claims 1, 6, 7, 10, 18
or 19 which further comprises a program analyzer for
analyzing and deriving new item(s), new key(s) or new
file(s) from source code of applications that access said
database model and for altering said database model using
said new item(s), new key(s) or new file(s).

48. An end-user query facility as in claims 46 or 47
wherein said program generator comprises means to generate
source program that produce data from said new items of
said new files and wherein said end-user query facility
further comprises a compiler for compiling said source
program.

49. An end-user query facility as in claim 43, 46 or 47
wherein said class generator further comprises a means to
[illegible] said database model to regenerate said set of
classes and said set of linkages.

50. An end-user query facility as in claims 1, 6, 7, 10, 18
or 19 which further comprises a security model specifier to
[illegible] a user to input a security model which specifies
restrictions on the information a user can obtain from said
database.

51. An end-user query facility as in claim 50 wherein said
security model comprises:

item security which specifies a set of database files and
items selected from said database which a user or class
of users can access; and

value security which specifies a set of conditions to
restrict said user or class of users to a certain range(s)
of values of said database.

52. An end user query facility as in claims 1, 6, 7, 10, 18
or 19 wherein said program comprises a source code program.

53. An end user query facility as in claim 52 which further
comprises a compiler for compiling said source code program.

54. An end user query facility as in claims 1, 6, 7, 10, 18
or 19 wherein said application comprises:

means for obtaining from a user a user supplied keyword
indicative of the data desired to be extracted from said 
database; 

means for determining all keywords in said database 
model having a predefined relationship with said user 
supplied keyword; and 

means for causing a user to select one or more of said
keywords having a predefined relationship, said one or
more keywords thus selected serving as said designa-
tion of the information to be extracted from said
database.

55. An end-user query facility as in claims 6, 7, 10, 18 or 
19 wherein said inference engine comprises: 

means for inferring new acquired knowledge threads and 
storing said new acquired knowledge threads in said 
knowledge base; and 

means for determining an access path to said database 
using either said basic knowledge threads or said 
acquired knowledge threads as required to meet the 
user query. 

56. An end-user query facility as in claim 55 wherein said 
acquired knowledge thread is derived through a combination 
in parallel of two or more of said basic knowledge threads 
such that one of the said basic knowledge threads has one or 
more of its consecutive files in common with the corre- 
sponding number of consecutive files starting from the 
thread head of another of said basic knowledge threads. 

57. An end-user query facility as in claim 56 wherein
said thread head is a first file on a said basic knowledge 
thread. 

58. An end-user query facility as in claims 1, 6, 7, 10, 18
or 19 wherein said knowledge base comprises a pre-created
knowledge base.

59. An end-user query facility as in claims 1, 6, 7, 10, 18
or 19 wherein said knowledge base comprises a run-time
created knowledge base.

60. An end-user query facility as in claims 1, 6, 7, 10, 18
or 19 wherein said knowledge base comprises a persistent
knowledge base.

61. An end-user query facility as in claims 1, 6, 7, 10, 18
or 19 wherein said knowledge base comprises a transient
knowledge base.

62. An end-user query facility as in claims 1, 6, 7, 10, 18
or 19 wherein said knowledge base is implemented in a data
dictionary of said database.

63. An end-user query facility as in claims 1, 6, 7, 10, 18
or 19 wherein said knowledge base is implemented in a
system catalog of said database.

64. An end-user query facility as in claims 10, 18 or 19 
which further comprises a second connectivity driver for 
connecting said program generator to said database through 
a database connectivity driver. 

65. An end-user query facility as in claims 10, 18 or 19
wherein said second e-mail agent further comprises means 
to post said desired information extracted from said database 
to the user mailbox. 

66. An end-user query facility as in claims 18 or 19 which 
further comprises a second connectivity driver for connect- 
ing said program generator to said database through a 
database connectivity driver. 

67. An end user query facility as in claim 66 wherein both 
said second connectivity driver and said database connec- 
tivity driver have the same data access interface that con- 
forms to a data access interface standard. 

68. An end user query facility as in claims 70 or 67 
wherein said data access interface standard is Microsoft 
Corporation's Open Database Connectivity (ODBC) stan- 
dard. 

69. An end user query facility as in claims 70 or 67
wherein said data access interface standard is Apple Com-
puter Incorporated's Data Access Language (DAL) stan-
dard.

70. An end-user query facility as in claims 18 or 19
wherein both said application and said first connectivity 
driver have the same data access interface that conforms to 
a data access interface standard. 

71. An end user query facility for accessing an existing
database having a plurality of database files formed using a 
database model comprising: 

an entity relationship (ER) model generator for reading 
said databases and deriving a plurality of entity- 
relationship models, said ER model generator compris- 
ing: 

an entity type classifier for classifying each said data-
base file into one of a plurality of entity types, said
entity types including a "kernel" entity type, a "sub-
type" entity type, a "characteristic" entity type, an
"associative" entity type, and a "pure lookup" entity
type;

a binary relationship generator for generating a plural- 
ity of binary relationships between said database 
files, each said binary relationship associated with a 
linkage representing a relation between a first file 
having a key that references an equivalent key of a 
second file, said binary relationships including: 
a "has_children" type that represents a first database 
file that is not classified as a "pure lookup" entity 
type and has a one-to-many relationship with a 
second database file, and said second database file 
does not have a unique key; 
a "has_wards" type that represents a first database
file that is not of a "pure lookup" entity type and
has a one-to-many relationship with a second
database file having a unique key;
a "has_subtype" type that represents a first database
file having a one-to-one relationship with a second
database file and said second database file is
classified as a "subtype" entity type; and
an "inverse_of_pure_lookup" type that represents a first
database file that is classified as a "pure_lookup"
entity type;

a model constructor for constructing said entity- 
relationship models, each said entity-relationship 
model representing a tree having a root and a plurality 
of branches, said root associated with one of said 
database files classified as a "kernel" entity type, said 
model constructor utilizing said entity types and said 
binary relationships to associate one or more branches 
with each said entity-relationship model; 
a knowledge base that stores said ER models; 
an application for interfacing with a user to obtain from 
said user choices based on said entity-relationship 
model as a designation of the information to be 
extracted from said database; 
an inference engine that, based upon said designation of
information to be extracted from said database, iden- 
tifies one or more of said database files that contain the 
desired information and searches said knowledge base 
to determine one or more linkages connecting said 
identified files; 
a program generator that accesses said linkages obtained 
by said inference engine and generates a program to 
extract said desired information from said database; 
and 

a query facility connectivity driver that comprises:
a first connectivity interface coupled to said application
that conforms to Microsoft open database connec-
tivity (ODBC) standard to interface with said infer-
ence engine and said knowledge base; and
a second connectivity interface coupled to said infer-
ence engine and said knowledge base to interface
with said existing database using Microsoft open
database connectivity (ODBC) driver.

72. An end user query facility as in claim 71 wherein said
binary relationships further comprises:

an "inverse_of_has_children" type that represents a
second database file that is not classified as a
"pure_lookup" entity type and has a one-to-many relationship
with a first database file having a unique key;

an "inverse_of_has_wards" type that represents a first
database file that is not classified as a "pure lookup"
entity type and has a one-to-many relationship with a
second database file having a unique key;

a "pure_lookup" type wherein said second database file
is classified as a "pure lookup" entity type; and

an "inverse_of_has_subtype" type wherein said second
database file has a one-to-one relationship with said
first database file, and said first database file is
classified as a "subtype" entity type.

73. An end-user query facility as in claim 71 wherein said 
information scout comprises an interface that displays all or 
selected ones of said entity-relationship models and that 
allows a user to formulate a query by picking desired
attributes from said entity-relationship model as a designa- 
tion of the information to be extracted from said database. 

74. An end-user query facility as in claim 71 which further 
comprises a model purifier for a user to alter said database 
model by defining new keys for database files or by altering 
existing keys of said database files.

75. An end-user query facility as in claim 71 which further 
comprises a model purifier for a user to alter said database 
model by defining one or more new items, one or more new 
keys or one or more new files using items of said database 
files. 



76. An end-user query facility as in claim 71 which further 
comprises a program analyzer for analyzing and deriving 
one or more new items, one or more new keys or one or more 
new files from source code of applications that access said 
database model, and for altering said database model using 
one or more of said new items, one or more of said new keys, 
or one or more of said new files. 

77. An end-user query facility as in claims 75 or 76 
wherein said program generator comprises means to gener-
ate a source program that produces data from said new items
of said new files and wherein said end-user query facility
further comprises a compiler for compiling said source
program.

78. An end-user query facility as in claim 71 which further 
comprises a security model specifier to allow a user to input 
a security model which specifies restrictions on the infor- 
mation a user can obtain from said existing database. 

79. An end-user query facility as in claim 78 wherein said
security model comprises:

item security which specifies a set of database files and
items selected from said database which a user or class
of users can access; and

value security which specifies a set of conditions to
restrict said user or class of users to one or more ranges
of values of said database.

80. An end user query facility as in claim 71 wherein said 
program comprises a source code program. 

81. An end-user query facility as in claim 80 which further 
comprises a compiler to compile said source code program.

82. An end-user query facility as in claim 71 which further 
comprises a means for a user to modify relationship names 
in the said set of entity-relationship models.

[57] ABSTRACT

A grammar, parsing method, and associated apparatus for 
automatically generating test commands to test an SQL 
database engine interface while reducing storage require- 
ments and improving access time for such test commands as 
compared with prior test tools. The test tools and methods 
include a grammar for concise syntactic representation of a 
meta-query (also referred to as meta-language statement, 
query pattern, or query template). The meta-query defines a
statement similar to the SQL language but includes query
elements and query list elements used to generate a plurality
of SQL test commands to be applied to the SQL database
engine under test. Test commands are generated from the
meta-query to reduce storage requirements of prior test 
methods. Query elements are variable space holders in the 
meta-query and are replaced by a value appropriate to the 
SQL database engine under test when the meta-query is used 
to generate test commands. Query list elements define a list 
of values to be inserted in place of the query list element 
when generating the test commands from the meta-query. 

9 Claims, 7 Drawing Sheets

METHOD AND APPARATUS FOR
GENERATING DATABASE QUERIES FROM
A META-QUERY PATTERN

FIELD OF THE INVENTION 

This invention relates to the testing of database software 
systems and in particular to the testing of database engine 
drivers in an Open DataBase Connection (ODBC) database 
environment by the automatic generation of test commands 
from a meta-query pattern. 

PROBLEM 

The computing structures and methods of the present 
invention are built upon open standard database Application 
Program Interfaces (APIs, also referred to herein as data-
base management interface means) such as Microsoft's
ODBC or X/Open's DATA MANAGEMENT: SQL CALL
LEVEL INTERFACE (X/Open Preliminary Specification
P303, ISBN 1-85912-015-6, previously publication
S203, to become publication C451, available from
X/Open Company Ltd, Berks, United Kingdom). These
standards permit client/server database application programs 
to be designed in accord with a common, standardized API 
while utilizing any underlying database engine which con- 
forms to these standards for the permanent physical storage 
of the managed information. End user installations using the 
present invention may therefore utilize any presently 
installed database management subsystem. The SQL 
(Structured Query Language) has been widely adopted as a 
de facto standard interface for the specification of database 
queries (and related data management commands). The 
ODBC API therefore enforces a standardized SQL query 
language and performs any translations necessary for opera- 
tion of a query upon a specific database engine (database 
management subsystem). This hierarchical API structure 
permits the application programmer to adhere to a single 
database/query architecture and yet easily adapt (port) the 
application program to the unique requirements of a par- 
ticular database engine through the ODBC API library 
functions. 

In testing such a standard database API, a test process 
must generate a large number of test commands for each 
database engine supported by the API. For example, a large
set of test commands is applied to Microsoft's ODBC API 
in order to test its use in conjunction with the dBase database 
engine. Yet another large set of test commands is needed to 
test ODBC when used in conjunction with the Access or 
Paradox database engines, etc. Though there is substantial 
similarity in these plural sets of test commands, there are 
invariably minor differences in syntax or semantics between 
the queries generated for each unique database engine. For 
example, some database engines support atomic data types 
which are unique to the engine. Or, for example, the size 
limits for certain data types may vary among various data- 
base engines. In view of these differences, prior test methods 
and tools for generating test commands for database API 
subsystems have created large sets of test commands and
stored them in a query database to be retrieved when the
corresponding database engine is tested with the ODBC API.
Each test command is "hard-coded" for the specific database
engine to which it corresponds. The query database which
stores these commands can therefore be quite large. As such,
as with any large database, access to the database for
purposes of extracting test commands to perform a particular
test sequence can be quite time consuming. Adding, deleting
or modifying test commands stored in the large query database
can also be time consuming due to re-indexing operations
associated with the changes in the query database.

An additional problem with the query database techniques
taught by prior test products and methods arises from the
fact that the query database is itself another database which
must be operated in the same computing platform on which
the test commands are being applied to the ODBC API.
Whatever DBMS package is used for the storage of the test
commands in the query database must be available on, or
ported to, the computing platform on which the ODBC/
database engine combination is being tested. This porting
effort may add a substantial workload to the ODBC test
efforts if the DBMS selected for the query database storage
is not presently available on the computing platform
presently being tested.

In view of the above discussions, it is clear that there exists
a need for methods and apparatus for managing and
manipulating test commands to be used in testing an ODBC/
database engine combination which improves speed of
access to the test commands, eases the modification of the
commands, and reduces the storage requirements for the
storage of the test commands.

SOLUTION 

The present invention solves the above identified problems
and other problems to thereby advance the state of the
useful arts by providing methods and associated apparatus
for generating SQL test commands from a query pattern
(also referred to herein as query template, meta-query, or
simply meta-language statement). The query pattern is
formed according to the syntax of a meta-language of the
present invention to define a set of SQL test commands in a
concise syntactic statement. Each meta-language test command
pattern (a meta-query) is parsed by the methods of the
present invention to generate all test commands in the set
defined by the meta-query. The SQL test commands so
generated are then applied to the database engine under test.
The meta-language of the present invention permits test
commands to be expressed in a concise, compact
meta-language syntax. Storage and modification of the concise,
compact meta-queries is simpler, faster, and requires
significantly less storage capacity as compared to the prior
techniques wherein all individual test commands are stored
in a test database. The meta-queries are stored in a standard
text file and may therefore be accessed or modified by any
of several well known techniques for viewing and modifying
text files.

The meta-language of the present invention expresses the
meta-queries according to the rules of a grammar definition.
The grammar definition includes "query elements" and
"query list elements." The query elements serve as variable
place holders in the SQL test commands specified by the
meta-query. When the meta-query is processed to generate
test commands, the query element placeholder is replaced by
a variable value appropriate for the database engine being
tested. Query list elements provide a list of values to be
substituted into the generated test commands as each test
command is generated. When a query list element is
specified in a meta-query, at least one query is generated for
each element in the query list element. If multiple query list
elements are specified in a meta-query, then a test command
is generated for each unique combination generated by
selecting one of the elements in each of the multiple query
list elements.
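The combination rule described above can be illustrated with a minimal Python sketch. It is not the parser of the invention: the function name `expand_query_lists`, the regular expression, and the sample template are assumptions chosen for illustration, and the sketch assumes that square brackets appear only around query list elements.

```python
import itertools
import re

# Hypothetical pattern: one query list element is a comma separated
# list of alternatives enclosed in square brackets.
LIST_ELEMENT = re.compile(r"\[([^\]]*)\]")


def expand_query_lists(meta_query):
    """Return one command per unique combination of list alternatives."""
    # Splitting on a pattern with one capture group alternates between
    # literal text (even indexes) and list bodies (odd indexes).
    parts = LIST_ELEMENT.split(meta_query)
    literals = parts[0::2]
    choices = [[alt.strip() for alt in body.split(",")] for body in parts[1::2]]

    commands = []
    # itertools.product selects one alternative from each list element.
    for combo in itertools.product(*choices):
        pieces = [literals[0]]
        for value, literal in zip(combo, literals[1:]):
            pieces.append(value)
            pieces.append(literal)
        commands.append("".join(pieces))
    return commands


# Illustrative template (not taken from the specification): two list
# elements with three and two alternatives give 3 x 2 = 6 commands.
template = "SELECT <column1> FROM <table1> WHERE <column1> [=,<,>] <column2> [+,-] <data>"
commands = expand_query_lists(template)
for command in commands:
    print(command)
```

A template with no query list element yields exactly one command, the template itself, because the product over zero lists contains a single empty combination. Query elements such as `<table1>` are left untouched by this step.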

The syntax of the meta-language is clearly and completely
defined by a simple BNF style specification as compared to
a complex database structure used by prior methods to store
and retrieve the set of test commands appropriate to the
database engine under test. The BNF definition defines the
rules for construction and generation of meta-language
commands, the semantic interpretation of which is used to
generate a set of SQL test commands.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a typical test environment
known in the art for testing a database API (e.g. Microsoft's
ODBC);

FIG. 2 is a block diagram of the database API test
environment of the present invention which utilizes a
meta-language syntax to represent large numbers of test
commands;

FIG. 3 is a block diagram of a computing environment in
which the test environment of the present invention operates;

FIG. 4 is a flowchart describing the operation of the query
profiler in accord with the methods of the present invention;

FIG. 5 is a flowchart which depicts additional detail of the
method shown in FIG. 4;

FIG. 6 is the first half of a flowchart of the reentrant parser
of the present invention which generates test commands
from meta-language statements; and

FIG. 7 is the second half of a flowchart of the reentrant
parser of the present invention which generates test commands
from meta-language statements.

DETAILED DESCRIPTION OF THE
INVENTION

OVERVIEW:

FIG. 1 is a block diagram of an approach to testing the
database API (such as Microsoft's ODBC) in conjunction
with a chosen database engine. Query profiler 120 generates
test SQL commands and applies the generated test commands
to the database API 122 to be tested. The commands
generated are intended to test the database API 122 for
proper operation in conjunction with one of the plurality of
database engines 1 through N (124, 126, and 128). In accord
with the known methods for implementing query profiler
120, test database 130 is constructed and maintained to
contain all possible query commands and associated options
for the generation of all test SQL queries applicable to all
database engines 124, 126, and 128 associated with database
API.

The precise structure of test database 130 may be specific
to each database API 122 or specific to the needs of the
database engines 124, 126, and 128 to be used in conjunction
with the API 122. Therefore, the detailed structure of test
database 130 is not relevant to an overall understanding of
the operation of known prior techniques. Tables 100-118 are
intended only as an exemplary database structure to
demonstrate the complexity of prior approaches. The various
tables and relationships depicted in FIG. 1 are used to define
and store the various commands needed to set up a particular
ODBC environment for testing a particular ODBC driver, to
store the various command options and command
parameters, and to store the test commands themselves,
among other information. The complexity of the test database
grows dramatically as additional options, parameters,
configurations and environments are added to the testing of each
ODBC driver.

FIG. 2 is a block diagram of a query profiler 200 which
utilizes the structures and methods of the present invention.
Database API 122 and the database engines 124, 126, and
128 are identical to those of FIG. 1. Query profiler 200 of
FIG. 2 retrieves and processes the meta-language statements
(query templates) from the query templates file 202. Each
meta-language statement in the query templates file 202 may
define a plurality of test commands to be generated by the
query profiler 200. The query templates file 202 is a simple
text file which may be easily constructed and maintained by
several well known tools for manipulating text files.
The storage space required to store the query templates file
202 is significantly reduced as compared to the storage
for the equivalent test database 130 of FIG. 1.

META-LANGUAGE SYNTAX AND SEMANTICS:

The meta-language of the present invention may be
viewed as a set of grammatical rules for constructing
statements used by the query profiler 200 of FIG. 2 to generate
test SQL commands. The meta-language is substantially
similar to the well known SQL query language with
elements added to define rules for the construction of actual
SQL statements. A typical SQL query command, for
example, consists essentially of the following syntax:

SELECT column FROM tables WHERE condition

where: column is replaced by one or more column names,
tables is replaced by one or more table names, and condition
is replaced by a logical expression which must evaluate to
true for a row to be selected from the tables. The result
of the SQL query is a set of the requested columns from
the rows selected by virtue of the logical
expression evaluating true for those rows. Some database
engines provide for additional elements to be named in the
identifying columns or tables or in the logical condition
expression. For example, parameters which further control
the search capability in the engine's data management or
specific limitations or additions relating to types of
supported data are frequently added to the features of a specific
database engine. Such additional elements are frequently
unique to the specific database engines supported by the
database API. To thoroughly test a database API (such as
Microsoft's ODBC) requires testing not only the features
common to all supported database engines, but also requires
the testing of features unique to each supported database
engine. Testing for the database engine specific features in
conjunction with the database API requires the creation of a
large number of specialized command options.

The present invention defines a meta-language syntax and
grammar which builds upon the syntax of standard SQL
commands. The meta-language syntax adds variable
elements to the SQL command syntax. When "parsed" by the
query profiler 200 (of FIG. 2) of the present invention, these
variable elements in the meta-language commands are
replaced by actual values and the resultant SQL commands
are thereby generated from the meta-language statements
(without the variable element syntax embedded). The
generated SQL commands are then applied to the database API
122 to test its proper operation in conjunction with one of the
database engine drivers (124, 126, or 128). The variable
elements of the meta-language statement can specify one or
more actual values to use in the generation of test SQL
commands and may therefore compactly represent a large
volume of generated test SQL commands. Such large
volumes of test SQL commands previously required significant
mass storage capacity and associated complexity to store
and retrieve the several SQL command sets required to test
the database API 122 operation.

Processing of the meta-language statements by the query
profiler 200 (of FIG. 2) automatically generates the test SQL
commands represented by the meta-language statements for
every data type supported by the specific database engine
driver under test (124, 126, or 128 of FIG. 2). The query
profiler 200 detects the data types supported by the database
engine through standard function calls of the database API
122. Such interface function calls to the API are well known
to those of ordinary skill in the art and are clearly described
in the public documentation available with the API (such as
Microsoft's ODBC database API). The query profiler 200
then loops through the processing of the meta-language
statements to generate test SQL commands once for each
supported data type.
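The replacement of variable elements and the per-data-type loop described above might be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the placeholder pattern, the table and column names, and the idea that a sample constant is chosen for each data type are all hypothetical, and the list of supported types is passed in directly rather than detected through the database API.

```python
import re

# Hypothetical pattern for a variable element: a name between angled
# braces. A bare comparison operator such as "<=" does not match
# because the character after "<" must be a letter or underscore.
QUERY_ELEMENT = re.compile(r"<([A-Za-z_][A-Za-z0-9_ ]*)>")


def substitute_elements(command, values):
    """Replace each <name> with its value; unknown names are left as-is."""
    return QUERY_ELEMENT.sub(
        lambda match: values.get(match.group(1), match.group(0)), command
    )


def generate_for_types(command, base_values, samples_by_type):
    """Generate one command per supported data type (assumed mapping)."""
    generated = []
    for type_name, sample in samples_by_type.items():
        # Assumption: the constant substituted for <data> varies by type.
        values = dict(base_values, data=sample)
        generated.append((type_name, substitute_elements(command, values)))
    return generated


# Illustrative test data names and sample constants (all assumed).
base = {"table1": "customers", "column1": "id"}
samples = {"SQL_INTEGER": "42", "SQL_VARCHAR": "'abc'"}
results = generate_for_types(
    "SELECT <column1> FROM <table1> WHERE <column1> <= <data>", base, samples
)
for type_name, sql in results:
    print(type_name, sql)
```

In a real test run the supported types would come from the database API itself rather than from a hand-written dictionary; the dictionary here only stands in for that step.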

The variable element of the meta-language adds "query 
dements" to the SQL query command syntax which are 
replaced in generation of the test commands by actual values x s 
appropriate to the database engine (124, 126, and 128 of 
FIG. 1) under test with the database API 122 of FIG. 2. The 
query elements are identified by query element identifiers 
(name strings for example) and are delimited in the meta- 
language statement by angled braces ("<" preceding the ^ 
query element identifier and **>" following the query ele- 
ment identifier). The following Table 1 provides exemplary 
query elements presently contemplated in the best known 
mode of the present invention. One of ordinary skill in the 
art will readily recognize that this list may be extended to ^ 
include other query elements which hold the place of 
language elements in the generated SQL queries and are 
unique to the database engine drivers. 



elements are included in a meta-language statement, then the 
query profiler (200 of FIG. 2) will generate a test command 
for each combination of the elements in all the query list 
elements of the statement 

Query list elements are replaced in generation of the test 
command by the alternate values supplied in the query list 
element when test commands are generated to test the 
database API 122. The query list elements are comma 
separated values delimited by a pair of square braces (a "[" 
preceding the list and a 4 T following the list). The following 
Table 2 provides exemplary query list elements presently 
contemplated in the best known mode of the present inven- 
tion. One of ordinary skill in the art will readily recognize 
that this exemplary list may be extended to include other 
query list elements which hold the place of language ele- 
ments in the generated SQL queries. 

TABLE 2 

Query List Element Replacement Information 

[-<,<=;>;>=, !=, |<, t> Generates eight test commands; one with each 
] of the eight listed logical test (comparison) 

operators (as used in the condition clause) 
[*=,=*] Generates two test commands; one with each 

of two Microsoft SQL Server syntax outer join 
operators 

[-,+,*/,%] Generates five test commands: one with each 
of the five listed arithmetic operators 
[SQL_PATS, Generates two test commands: one with each 
S QL_TIME_STAMP] of the two listed standard data types 



TABLE 1 



Query Element Replacement Informatics 



<quali£er> 



<tableN> 



Toe current qualifier for the database engine driver 
under test (i.e. the ODBC connection option - 
S QL_CURKENT_QUALIFIER) 
Name of a table in test data for the SQL command 
(where N is a number from 1 through the number of 
tables in the test data) 
<columnN> Name of a column in a table in the test data for the 
SQL command (where N is a number from 1 through 
the number of columns in the associated table) 
<alias> Name of an alias for a column in the test data for the 

SQL command 

<data> A constant data value for use in the SQL command 

<cotumn name> Name of generated column (Le. one that doesn't 
currently exist in the created table and used in the 
w ALTER TABLE" queries so that the column name 
won't conflict with an existing column name) 
<cohxmn def > Data type of the <column name> element 



30 The following meta-language statement examples provide 
further clarification of the power and syntax of the meta- 
language for the specification of large numbers of test SQL 
commands. In particular it is to be noted that the meta- 
language may be applied to many SQL commands (not 

35 merely the "SELECT" command). 



SELECT <tablct>.<columnl> FROM <tablefc>,<table2> WHERE 
<table>.<columnl> f= I <,<= ( > J >= f l= 1 I< f ]>] <table2>.<columnl> 



40 



This exemplary meta-language statement generates eight 
queries selecting rows from column 1 of table 1 (in the test 
data) where the column 1 value in table 1 of each row 
compares using the selected one of eight comparison opera- 
45 tors with the same row and column of table 2. The eight 
generated queries are: 



The meta-language (query templates/query patterns) of 
the present invention also includes "query list elements" 
which, when used in a meta-language statement, cause the 
generation of a plurality of SQL commands, one for each 
element in the query element list. This feature of the 
meta-language permits the compact representation of a large 
set of test commands in a concise, single, meta-language 
statement. This representation of a collection of test 
commands is simpler to maintain or modify and requires 
significantly less storage than the methods employed in the past 
to test a database API. 

A query list element provides a list of alternate values to 
be used in generating test commands from the 
meta-language statement (query template). Each of the alternate 
values in the query list element is used to replace the query 
list element in the generation of one (or more) test 
commands. In other words, a query list element that indicates 
four alternate values will generate (at least) four test 
commands, (at least) one for each of the four alternate 
values in the query list element. If multiple query list 



SELECT <table1>.<column1> FROM <table1>,<table2> WHERE 
<table1>.<column1> = <table2>.<column1> 

SELECT <table1>.<column1> FROM <table1>,<table2> WHERE 
<table1>.<column1> < <table2>.<column1> 

SELECT <table1>.<column1> FROM <table1>,<table2> WHERE 
<table1>.<column1> <= <table2>.<column1> 

SELECT <table1>.<column1> FROM <table1>,<table2> WHERE 
<table1>.<column1> > <table2>.<column1> 

SELECT <table1>.<column1> FROM <table1>,<table2> WHERE 
<table1>.<column1> >= <table2>.<column1> 

SELECT <table1>.<column1> FROM <table1>,<table2> WHERE 
<table1>.<column1> != <table2>.<column1> 

SELECT <table1>.<column1> FROM <table1>,<table2> WHERE 
<table1>.<column1> !< <table2>.<column1> 

SELECT <table1>.<column1> FROM <table1>,<table2> WHERE 
<table1>.<column1> !> <table2>.<column1> 
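
The expansion of a query list element into one command per listed value can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the reentrant parser of the specification: the function name and the regular expression are hypothetical, list values are assumed to be comma separated inside square brackets, and nested lists are not given any special treatment.

```python
import re

# Matches one square-bracketed query list element that contains no
# further brackets, e.g. [=,<,<=].
LIST_ELEMENT = re.compile(r"\[([^\[\]]*)\]")


def expand_template(template):
    """Return every statement obtained by expanding query list elements."""
    match = LIST_ELEMENT.search(template)
    if match is None:
        # No list element left: the template is a finished statement.
        return [template]
    results = []
    for value in match.group(1).split(","):
        # Substitute one value, then re-parse the revised statement so
        # that any remaining list elements are expanded in turn.
        revised = template[:match.start()] + value.strip() + template[match.end():]
        results.extend(expand_template(revised))
    return results


template = (
    "SELECT <table1>.<column1> FROM <table1>,<table2> WHERE "
    "<table1>.<column1> [=,<,<=,>,>=,!=,!<,!>] <table2>.<column1>"
)
queries = expand_template(template)
for query in queries:
    print(query)
```

Because each substitution re-invokes the same function on the revised statement, a template with two list elements yields the product of their lengths.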



In addition, the query profiler 200 of FIG. 2 will generate 
these eight test commands for all data types supported by the 
selected database engine driver (124, 126, or 128 of FIG. 2). 
For example, Microsoft Access version 2.0 supports 15 
distinct data types. Therefore, this exemplary meta-language 
statement generates 8*15 or 120 test commands when 
testing the database API 122 in conjunction with a Microsoft 
Access database engine driver. 

As a further example, consider: 

CREATE INDEX ... ON <table1> (<column1>) 

This exemplary meta-language statement generates a new 
table index with a column name appropriate to the data type 
currently being processed by the query profiler 200. As 
noted above, Microsoft Access, for example, supports 15 
data types and therefore, this meta-language statement 
generates 15 SQL commands when testing the database API 122 
in conjunction with the Microsoft Access database engine 
driver. 

QUERY PROFILER: 

Query profiler 200 of FIG. 2 is operable on a data 
processing system to parse the meta-language statements 
and to generate test SQL commands for application to the 
database API 122. FIG. 3 is a block diagram depicting a 
typical computing environment in which query profiler 200 
operates. Data processing system 310 provides the central 
processing, memory, and mass storage components for 
operation of query profiler 200, database API 122, and 
database engine drivers 124 and 126. Database engine 
drivers 124 and 126 store and retrieve information on local 
disks 300 and 302. Data processing system 310 may be 
connected to other data processing systems 308 over 
network attachment 306. Additional database engine drivers 
128 and local disks 304 may reside within the data 
processing system 308. Database API 122 may interact with a 
remote database engine driver 128 through any of several 
well known network computing architectures. Further, one 
of ordinary skill in the computing arts will readily recognize 
that the computing environment depicted in FIG. 3 is only 
exemplary of one such architecture in which the structures 
and methods of the present invention may operate. The 
present invention is equally applicable to computing 
environments without networked connections to other data 
processing systems or to distributed computing environments 
utilizing other topological configurations or connectivity 
technologies. 

FIG. 4 is a flowchart depicting the methods of the present 
invention as implemented by the query profiler 200. Element 
400 of FIG. 4 invokes functions in the database API (122 of 
FIG. 2) required to associate the test procedure with a 
particular database engine driver module under test (124, 
126, or 128 of FIG. 2). Element 402 then generates test data 
in tables created and managed by the database engine driver 
under test. This test data is used by the selected database 
engine 124, 126, or 128 through the database API 122 at the 
direction of the query profiler 200 in its interpretation of the 
meta-language statements. Since the query profiler generates 
the test data, it can predict the expected result of each SQL 
command generated from the meta-language statements and 
applied to the database API and engine. The specific form of 
the generated tables is a matter of design choice made by the 
test engineering staff in creating the test procedures. One or 
more tables may be created and each table may have one or 
more columns as desired by the test engineers to adequately 
test the database API interface to the database engine driver. 

Elements 404 and 406 initialize for the looping functions 
performed by elements 408-420. The test SQL commands 
generated for testing the API interface to the engine are 
generated for each data type supported by the underlying 
database engine. Element 404 sets the variable "N" to the 
number of data types supported by the selected database 
engine. Element 406 loads all the query patterns from a text 
file in which they are stored. The query patterns are 
previously designed by the test engineers to compactly specify 
the voluminous test commands required to adequately test 
the interface between the database API and a database 
engine driver module. As discussed above, the query 
patterns are written in simple textual form in the syntax of the 
meta-language. Element 406 serves to read 
the text file storing the pre-defined query patterns in 
preparation for parsing the meta-language statements and 
generating the SQL commands specified therein. 

Elements 408-422 are repetitively operable for each data 
type supported by the selected database engine driver. 
Element 408 tests whether the counter variable "N" (indicating 
the number of supported data types) has been decremented 
to zero. On each iteration of the loop (elements 408-422), 
element 422 is operable to decrement the counter variable 
"N." Elements 410-420 are therefore operable to generate 
the test commands specified by all query patterns for a single 
data type supported by the selected database engine driver. 

Element 410 sets the variable "M" to the number of query 
patterns pre-defined by the test engineers in the text file, in 
other words, the number of records to be processed in the 
meta-language file. Each record provides another query 
pattern in the meta-language syntax described above. Each 
record is therefore processed in turn to generate all the test 
SQL commands required to test the database API in 
conjunction with the selected database engine driver. 

Elements 412-420 are repetitively operable for each 
record (meta-language statement or query pattern) retrieved 
from the text file. Element 412 tests whether the counter 
variable "M" (indicating the number of meta-language 
statements in the text file) has been decremented to zero. On each 
iteration of the loop (elements 412-420), element 420 is 
operable to decrement the counter variable "M." Elements 
414-418 are therefore operable to generate the test 
commands specified by a single query pattern for a single data type 
supported by the selected database engine driver. 

Element 414 parses the meta-language statement to 
process all query elements and query list elements. Parsing of 
the meta-language statement includes locating all query 
elements and replacing them by values appropriate to the 
particular data type presently being processed and as 
appropriate for the selected database engine driver. Additionally, 
the parsing process locates any query list elements in the 
meta-language statement and generates one SQL command 
for each element in the list. Each of the generated SQL 
commands is thereby generated by substitution of actual 
values for the variable elements of the meta-language 
statement. 

Element 416 then applies the SQL commands generated 
by element 414 to the database API 122. The SQL 
commands so applied are in turn transformed and transferred to 
the selected database engine driver 124, 126, or 128 of FIG. 
2 for actual processing upon the test data stored on the mass 
storage devices (300, 302, and 304 of FIG. 3). Element 418 
captures, records, and analyzes the results of the SQL 
command processing returned by the database engine driver. 
Processing of these results is discussed below in additional 
detail. 

As noted above, element 420 is next operable to 
decrement the loop counter variable "M" and element 422 
decrements the loop counter variable "N" to control the 
iterative looping of the method. When element 412 
determines that all records in the meta-language text file have 
been processed, it returns control to element 422 to process 
another supported data type. Likewise, when element 408 
determines that all supported data types have been 
processed, the method completes processing. 

FIGS. 5-7 combine to provide a flowchart providing 
additional detail of the operation of element 414 of FIG. 4 
which generates all SQL commands from a single query 
template (meta-language statement). Element 500 of FIG. 5 
places the query pattern (meta-language statement) to be 
parsed into a memory input buffer. Element 502 of FIG. 5 
then initially invokes the reentrant parser to parse the tokens 
of the meta-language statement. Tokens in the 
meta-language statement (query pattern or template) are, in their 
simplest form, fields of non-space characters separated by 
spaces. Each token is therefore either a query element (if it 
is delimited by angle braces), or a query list element (if it is 
delimited by square braces), or is a constant textual string 
which forms a constant portion of the desired SQL command 
to be generated. There may be a plurality of query elements 
or query list elements in a single meta-language statement. 
In addition, the elements of a query list element may 
themselves be other query elements or query list elements 
(i.e. nested variable portions of the query template). For this 
reason, the parser of the query profiler of the present 
invention is reentrant so as to permit parsing of nested 
variable elements within the template. 

FIG. 6 depicts the details of the reentrant parser of the 
query profiler. The parser is entered in a reentrant manner: 
i.e. saving previous status and allocating local variables on 
a stack. Element 504 sets the counter variable "J" to the 
number of tokens found in the input buffer. Counter variable 
J is provided as a parameter to the reentrant function. 
Elements 506 and 520 are operable to loop on the invocation 
of elements 510-518 (and 530-542 of FIG. 7 below) for 
each token found in the input buffer. If element 506 
determines that all tokens in the input buffer have been processed, 
element 508 is operable to generate the completed SQL 
command in the output buffer. The completed command is 
then applied to the database API (122 of FIG. 2) as discussed 
above with respect to FIG. 4. If further tokens remain to be 
processed, element 510 is operable to get the next token 
from the input buffer for further processing. 

Element 512 determines if the token to be processed is a 
query element type of token (i.e. delimited by angle braces). 
If so, element 514 is operable to copy the replacement value 
for the query element (as discussed above) into the output 
buffer. This replacement value stands in place of the query 
element in the SQL command being generated from the 
query template. Processing then continues at element 520 by 
looping through the process. 

If the token is not a query element, element 512 
determines whether the token is a query list element (i.e. 
delimited by square braces). If not, the token must be a 
constant portion of the query pattern and so is simply copied 
to the output buffer to become a constant part of the 
generated SQL command. If the token is a query list 
element, processing continues at element 530 of FIG. 7. 

Element 530 of FIG. 7 separates the query list elements into 
the individual values (the comma separated values of the 
list). Element 532 sets the counter variable "K" to the 
number of value elements in the list. If element 534 
determines that there are no more values in the list to be 
processed, then processing continues by returning to 
element 520 of FIG. 6. 

For each value in the list, elements 534-542 are invoked 
to generate an SQL command in the output buffer. Element 
536 first clears the output buffer generated up to this point 
(by earlier operation of elements 506-520 of FIG. 6). Next, 
element 538 creates a new input buffer with the current input 
buffer but with the query list element (now being processed) 
replaced by the next value from the list. Element 540 then 
invokes the reentrant parser function to re-parse the new 
input buffer with the currently processed query list element 
replaced by its next value from the list. After processing of 
the revised meta-language statement (the new input buffer) 
is complete, and the associated SQL commands are 
generated, processing continues in the present invocation of 
the parser with element 542 decrementing the loop count 
variable "K" to indicate another value in the list is 
processed. Upon completion of the processing of the present 
query list element, processing continues at element 520 of 
FIG. 6 to process the remaining tokens of the meta-language 
statement. 

Processing continues in this manner for each value in the 
query list element until all SQL commands represented by 
the query pattern (meta-language statement) have been 
generated. One of ordinary skill in the art will recognize that 
other forms of recursive or reentrant designs of the method 
of the present invention may achieve the same purpose. Such 
design choices for reentrant or recursive methods are well 
known to those of ordinary skill in the software arts. In 
addition, the methods of the present invention may be 
simplified by restricting the meta-language syntax to 
prohibit the nesting of, or even a plurality of, query list 
elements. Such a design choice eliminates the need for 
recursion in the processing of the meta-language. Again, 
such design choices are well known to those of ordinary skill 
in the software arts. 

BNF DESCRIPTION OF GRAMMAR RULES: 

The meta-language of the present invention may be 
understood as a set of grammatical rules for the formation of 
legal statements within the grammar. A BNF format 
description is a common format in which to express such rules. The 
following BNF rule description includes the entire SQL 
standard language grammatical rules from which the rules of 
the present invention are an extension. The extensions to the 
SQL grammar defined by the rules of the present invention 
are highlighted in bold characters to distinguish them from 
the standard rules which comprise the standard SQL 
language. For added clarity, the enhancements to the SQL BNF 
grammar rules all have identifiers that begin with the 
characters "QP". 
Elements used in SQL Statements: 

all-function ::= {AVG | MAX | MIN | SUM} (expression) 
approximate-numeric-literal ::= mantissaEexponent 
approximate-numeric-type ::= {approximate numeric types} 
argument-list ::= expression | expression, argument-list 
base-table-identifier ::= QP-base-table-name 
base-table-name ::= base-table-identifier 
    | owner-name.base-table-identifier 
    | qualifier-name qualifier-separator base-table-identifier 
    | qualifier-name qualifier-separator [owner-name].base-table-identifier 
between-predicate ::= 
    expression [NOT] BETWEEN expression AND expression 
binary-literal ::= {implementation defined} 
bit-type ::= {bit types} 
boolean-factor ::= [NOT] boolean-primary 
boolean-primary ::= predicate | ( search-condition ) 
boolean-term ::= boolean-factor [AND boolean-term] 
character ::= {any character in the implementor's character set} 
character-string-literal ::= '{character}...' 
character-string-type ::= {character types} 
column-alias ::= QP-alias 
column-identifier ::= QP-column-identifier 



column-name ::= [table-name.]column-identifier 
column-name ::= [{table-name | correlation-name}.]column-identifier 
comparison-operator ::= < | > | <= | >= | = | <> 
comparison-predicate ::= expression comparison-operator expression 
comparison-predicate ::= expression QP-comparison-list expression 
comparison-predicate ::= expression QP-outer-join-list expression 
comparison-predicate ::= 
    expression comparison-operator {expression | (sub-query)} 
correlation-name ::= QP-alias 
cursor-name ::= QP-cursor-name 
data-type ::= character-string-type 
data-type ::= 
    character-string-type 
    | exact-numeric-type 
    | approximate-numeric-type 
data-type ::= 
    character-string-type 
    | exact-numeric-type 
    | approximate-numeric-type 
    | bit-type 
    | binary-type 
    | date-type 
    | time-type 
    | timestamp-type 
date-separator ::= - 
date-type ::= {date types} 
date-value ::= 
    years-value date-separator months-value date-separator days-value 
date-value ::= 
    QP-sql-date-time-list 
days-value ::= digit digit 
digit ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 
distinct-function ::= 
    {AVG | COUNT | MAX | MIN | SUM} (DISTINCT column-name) 
dynamic-parameter ::= ? 
empty-string ::= 
escape-character ::= character 
exact-numeric-literal ::= 
    [+ | -] { unsigned-integer[.unsigned-integer] 
    | unsigned-integer. 
    | .unsigned-integer } 
exact-numeric-type ::= {exact numeric types} 
exists-predicate ::= EXISTS ( sub-query ) 
exponent ::= [+ | -] unsigned-integer 
expression ::= term | expression {+ | -} term 
expression ::= term | expression QP-math-operation-list term 

factor ::= [+ | -]primary 
hours-value ::= digit digit 
index-identifier ::= QP-index-name 
index-name ::= [index-qualifier.]index-identifier 
index-qualifier ::= QP-index-qualifier 
in-predicate ::= expression [NOT] IN {(value {, value}...) | (sub-query)} 
insert-value ::= 
    dynamic-parameter 
    | literal 
    | NULL 
    | USER 
keyword ::= 
    (see list of reserved keywords) 
length ::= unsigned-integer 
letter ::= lower-case-letter | upper-case-letter 
like-predicate ::= expression [NOT] LIKE pattern-value 
like-predicate ::= 
    expression [NOT] LIKE pattern-value [ODBC-like-escape-clause] 
literal ::= character-string-literal 
literal ::= character-string-literal | numeric-literal 
literal ::= character-string-literal 
    | numeric-literal 
    | bit-literal 
    | binary-literal 
    | ODBC-date-time-extension 
lower-case-letter ::= 
    a | b | c | d | e | f | g | h | i | j | k | l | m | n | o | p | q | r | s | t | u | v | w | 
    x | y | z 
mantissa ::= exact-numeric-literal 
minutes-value ::= digit digit 
months-value ::= digit digit 
null-predicate ::= column-name IS [NOT] NULL 
numeric-literal ::= exact-numeric-literal | approximate-numeric-literal 
ODBC-date-literal ::= 
    ODBC-std-esc-initiator d 'date-value' ODBC-std-esc-terminator 
    | ODBC-ext-esc-initiator d 'date-value' ODBC-ext-esc-terminator 
ODBC-date-time-extension ::= 
    ODBC-date-literal 
    | ODBC-time-literal 
    | ODBC-timestamp-literal 
ODBC-like-escape-clause ::= 
    ODBC-std-esc-initiator escape 'escape-character' 
    ODBC-std-esc-terminator 
    | ODBC-ext-esc-initiator escape 'escape-character' 
    ODBC-ext-esc-terminator 



ODBC-time-literal ::= 
    ODBC-std-esc-initiator t 'time-value' ODBC-std-esc-terminator 
    | ODBC-ext-esc-initiator t 'time-value' ODBC-ext-esc-terminator 
ODBC-timestamp-literal ::= 
    ODBC-std-esc-initiator ts 'timestamp-value' ODBC-std-esc-terminator 
    | ODBC-ext-esc-initiator ts 'timestamp-value' ODBC-ext-esc-terminator 
ODBC-ext-esc-initiator ::= { 
ODBC-ext-esc-terminator ::= } 
ODBC-outer-join-extension ::= 
    ODBC-std-esc-initiator oj outer-join ODBC-std-esc-terminator 
    | ODBC-ext-esc-initiator oj outer-join ODBC-ext-esc-terminator 
ODBC-scalar-function-extension ::= 
    ODBC-std-esc-initiator fn scalar-function ODBC-std-esc-terminator 
    | ODBC-ext-esc-initiator fn scalar-function ODBC-ext-esc-terminator 
ODBC-std-esc-initiator ::= ODBC-std-esc-prefix SQL-esc-vendor-clause 
ODBC-std-esc-prefix ::= --(* 
ODBC-std-esc-terminator ::= *)-- 
order-by-clause ::= ORDER BY sort-specification [, sort-specification]... 
outer-join ::= table-name [correlation-name] {LEFT | RIGHT | FULL} 
    OUTER JOIN {table-name [correlation-name] | outer-join} ON search-condition 
owner-name ::= QP-current-qualifier 
pattern-value ::= character-string-literal | dynamic-parameter 
pattern-value ::= character-string-literal | dynamic-parameter | USER 
precision ::= unsigned-integer 
predicate ::= comparison-predicate | like-predicate | null-predicate 
predicate ::= 
    between-predicate | comparison-predicate | exists-predicate 
    | in-predicate | like-predicate | null-predicate | quantified-predicate 
primary ::= column-name 
    | dynamic-parameter 
    | literal 
    | ( expression ) 
primary ::= column-name 
    | dynamic-parameter 
    | literal 
    | set-function-reference 
    | USER 
    | ( expression ) 
primary ::= column-name 
    | dynamic-parameter 
    | literal 
    | ODBC-scalar-function-extension 
    | set-function-reference 
    | USER 
    | ( expression ) 

procedure ::= procedure-name | procedure-name (procedure-parameter-list) 
procedure-identifier ::= QP-procedure-identifier 
procedure-name ::= procedure-identifier 
    | owner-name.procedure-identifier 
    | qualifier-name qualifier-separator procedure-identifier 
    | qualifier-name qualifier-separator [owner-name].procedure-identifier 
procedure-parameter-list ::= procedure-parameter 
    | procedure-parameter, procedure-parameter-list 
procedure-parameter ::= dynamic-parameter | literal | empty-string 
ref-table-name ::= base-table-identifier 
qualifier-name ::= QP-current-qualifier 
qualifier-separator ::= {implementation-defined} 
quantified-predicate ::= expression comparison-operator {ALL | ANY} 
    (sub-query) 
query-specification ::= 
    SELECT [ALL | DISTINCT] select-list 
    FROM table-reference-list 
    [WHERE search-condition] 
    [GROUP BY column-name [, column-name]...] 
    [HAVING search-condition] 
ref-table-name ::= base-table-identifier 
    | owner-name.base-table-identifier 
    | qualifier-name qualifier-separator base-table-identifier 
    | qualifier-name qualifier-separator [owner-name].base-table-identifier 
referenced-columns ::= ( column-identifier [, column-identifier]... ) 
referencing-columns ::= ( column-identifier [, column-identifier]... ) 
scalar-function ::= function-name (argument-list) 
scale ::= unsigned-integer 
search-condition ::= boolean-term [OR search-condition] 
seconds-fraction ::= unsigned-integer 
seconds-value ::= digit digit 
select-list ::= * | select-sublist [, select-sublist]... 
select-sublist ::= expression 
select-sublist ::= expression [[AS] column-alias] 
    | {table-name | correlation-name}.* 
set-function-reference ::= COUNT(*) | distinct-function | all-function 
sort-specification ::= {unsigned-integer | column-name} [ASC | DESC] 
SQL-esc-vendor-clause ::= VENDOR(Microsoft), PRODUCT(ODBC) 
sub-query ::= 
    SELECT [ALL | DISTINCT] select-list 
    FROM table-reference-list 
    [WHERE search-condition] 
    [GROUP BY column-name [, column-name]...] 
    [HAVING search-condition] 
table-identifier ::= QP-base-table-name 
table-name ::= table-identifier 

    | owner-name.table-identifier 
    | qualifier-name qualifier-separator table-identifier 
    | qualifier-name qualifier-separator [owner-name].table-identifier 
table-reference ::= table-name 
table-reference ::= table-name [correlation-name] 
table-reference ::= table-name [correlation-name] 
    | ODBC-outer-join-extension 
table-reference-list ::= table-reference [,table-reference]... 
term ::= factor | term {* | /} factor 
time-separator ::= : 
time-type ::= {time types} 
time-value ::= 
    hours-value time-separator minutes-value time-separator 
    seconds-value 
timestamp-separator ::= 
    (The blank character.) 
timestamp-type ::= {timestamp types} 
timestamp-value ::= date-value timestamp-separator 
    time-value[.seconds-fraction] 
timestamp-value ::= QP-sql-date-time-list 
unsigned-integer ::= {digit}... 
upper-case-letter ::= 
    A | B | C | D | E | F | G | H | I | J | K | L | M | 
    N | O | P | Q | R | S | T | U | V | W | X | Y | Z 
user-defined-name ::= letter[digit | letter | _]... 
user-name ::= user-defined-name 
value ::= literal | USER | dynamic-parameter 
viewed-table-identifier ::= user-defined-name 
viewed-table-name ::= viewed-table-identifier 
    | owner-name.viewed-table-identifier 
    | qualifier-name qualifier-separator viewed-table-identifier 
    | qualifier-name qualifier-separator [owner-name].viewed-table-identifier 
years-value ::= digit digit digit digit 

function-name ::= ODBC-string-functions | 
    ODBC-numeric-functions | 
    ODBC-time-and-date-functions | 
    ODBC-system-functions | 
    ODBC-convert-function 

ODBC-string-functions ::= ASCII(string_exp) | 
    CHAR(code) | 
    CONCAT(string_exp1, string_exp2) | 
    DIFFERENCE(string_exp1, string_exp2) | 
    INSERT(string_exp1, start, length, string_exp2) | 
    LCASE(string_exp) | 

    LEFT(string_exp, count) | 
    LENGTH(string_exp) | 
    LOCATE(string_exp1, string_exp2[, start]) | 
    LTRIM(string_exp) | 
    REPEAT(string_exp, count) | 
    REPLACE(string_exp1, string_exp2, string_exp3) | 
    RIGHT(string_exp, count) | 
    RTRIM(string_exp) | 
    SOUNDEX(string_exp) | 
    SOUNDEX(count) | 
    RIGHT(string_exp, start, length) | 
    RTRIM(string_exp) 

string_exp ::= QP-column-identifier | 
    QP-sql-char-list | 
    string-literal | 
    ODBC-string-functions 
string_exp1 ::= string_exp 
string_exp2 ::= string_exp 
length ::= QP-data-element | 
    QP-sql-numeric-list | 
    number | 
    ODBC-numeric-functions 
start ::= QP-data-element | 
    number | 
    QP-sql-numeric-list | 
    ODBC-numeric-functions 
count ::= QP-data-element | 
    number | 
    QP-sql-numeric-list | 
    ODBC-numeric-functions 

ODBC-numeric-functions ::= ABS(numeric_exp) | 
    ACOS(float_exp) | 
    ASIN(float_exp) | 
    ATAN(float_exp) | 
    ATAN2(float_exp1, float_exp2) | 
    CEILING(numeric_exp) | 
    COS(float_exp) | 
    COT(float_exp) | 
    DEGREES(numeric_exp) | 
    EXP(float_exp) | 
    FLOOR(numeric_exp) | 
    LOG(float_exp) | 
    LOG10(float_exp) | 
    MOD(integer_exp1, integer_exp2) | 

    PI() | 
    POWER(numeric_exp, integer_exp) | 
    RADIANS(numeric_exp) | 
    RAND([integer_exp]) | 
    ROUND(numeric_exp, integer_exp) | 
    SIGN(numeric_exp) | 
    SIN(float_exp) | 
    SQRT(float_exp) | 
    TAN(float_exp) | 
    TRUNCATE(numeric_exp, integer_exp) 

numeric_exp ::= QP-data-element | 
    QP-column-identifier | 
    QP-sql-numeric-list | 
    number | 
    ODBC-numeric-functions 

float_exp ::= QP-data-element | 
    QP-column-identifier | 
    QP-sql-numeric-list | 
    number | 
    ODBC-numeric-functions 

integer_exp ::= QP-data-element | 
    QP-column-identifier | 
    QP-sql-numeric-list | 
    number | 
    ODBC-numeric-functions 

ODBC-time-and-date-functions ::= CURDATE() | 
    CURTIME() | 
    DAYNAME(date_exp) | 
    DAYOFMONTH(date_exp) | 
    DAYOFWEEK(date_exp) | 
    DAYOFYEAR(date_exp) | 
    HOUR(date_exp) | 
    MINUTE(date_exp) | 
    MONTH(date_exp) | 
    MONTHNAME(date_exp) | 
    NOW() | 
    QUARTER(date_exp) | 
    SECOND(date_exp) | 
    TIMESTAMPADD(interval, integer_exp, timestamp_exp) | 
    TIMESTAMPDIFF(interval, timestamp_exp1, timestamp_exp2) | 
    WEEK(date_exp) | 
    YEAR(date_exp) 

date_exp ::= QP-data-element | 
    QP-column-identifier | 
    QP-sql-date-time-list | 
    number | 
    ODBC-time-and-date-functions 

timestamp_exp ::= QP-data-element | 
    QP-column-identifier | 
    QP-sql-date-time-list | 
    ODBC-time-and-date-functions 

interval ::= SQL_TSI_FRAC_SECOND | 
    SQL_TSI_SECOND | 
    SQL_TSI_MINUTE | 
    SQL_TSI_HOUR | 
    SQL_TSI_DAY | 
    SQL_TSI_WEEK | 
    SQL_TSI_MONTH | 
    SQL_TSI_QUARTER | 
    SQL_TSI_YEAR | 

ODBC-system-functions ::= DATABASE() | 
    IFNULL(exp, value) | 
    USER() | 

exp ::= column-name 
exp ::= column-name QP-math-operation-list column-name 
value ::= QP-data-element 

ODBC-convert-function ::= CONVERT(QP-column-identifier, QP-sql-data-type-list-element) 
ODBC-convert-function ::= CONVERT(value-exp, data-type) 

ODBC-data-type ::= SQL_CHAR | 
    SQL_VARCHAR | 
    SQL_LONGVARCHAR | 
    SQL_DECIMAL | 
    SQL_NUMERIC | 
    SQL_SMALLINT | 
    SQL_INTEGER | 
    SQL_REAL | 
    SQL_FLOAT | 
    SQL_DOUBLE | 
    SQL_TINYINT | 
    SQL_BIGINT | 
    SQL_BINARY | 
    SQL_VARBINARY | 

    SQL_LONGVARBINARY | 
    SQL_DATE | 
    SQL_TIMESTAMP 

ODBC-char-type ::= SQL_CHAR | 
    SQL_VARCHAR | 
    SQL_LONGVARCHAR 

ODBC-numeric-type ::= SQL_DECIMAL | 
    SQL_NUMERIC | 
    SQL_SMALLINT | 
    SQL_INTEGER | 
    SQL_REAL | 
    SQL_FLOAT | 
    SQL_DOUBLE | 
    SQL_TINYINT | 
    SQL_BIGINT 

ODBC-binary-type ::= SQL_BINARY | 
    SQL_VARBINARY | 
    SQL_LONGVARBINARY 

ODBC-date-time-type ::= SQL_DATE | 
    SQL_TIMESTAMP 

QP-function-name ::= ASCII( ) 
QP-procedure-identifier ::= < index-qualifier QP-number > 
QP-index-qualifier ::= < index-qualifier QP-number > 
QP-cursor-name ::= < cursor QP-number > 
QP-index-name ::= < create table QP-number > 
QP-current-qualifier ::= < qualifier > 
QP-base-table-name ::= < table QP-number > 
QP-table-extension ::= <ext> 
QP-column-identifier ::= < column QP-number > 
QP-alias ::= < alias number > 
QP-data-element ::= < data QP-number > 
QP-column-name ::= < column name > 
QP-column-definition ::= < column def > 
QP-list ::= [ QP-comparison-list | 
    QP-outer-join-list | 
    QP-math-operation-list | 
    QP-sql-data-type-list | 
    QP-sql-date-time-list | 
    QP-sql-numeric-list | 
    QP-sql-char-list | 
    QP-clause-list ] 

25 
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QP-clause-iist :: = QP-clause-lisMement 

QP-ciause-fist-eJemant :: = QP-clause-tist-element | 
HAW/YG | 

5 GROUP-BY | 

QP-comparison-Jist :: = QP-comparison-iist-eJemem 
QP-outer-join-iist :: = QPoutEf-join-Hst-etemetn 
OP-math-opefatkxhftst :: = QP-math-operatkxUist^ement 
1 0 QP-sql-daia-fypeJist :: = QP^<tiia-typeJi$t^ement 

QP-number::= 0|1|2|3|4[5|6|7|8|9| 10.., 
QP-comparison-list-elemeftt ::= comparfson-list-etoment \ 

= I < I <« I > I >» I i= I i< M 
QP-outBr-foln-ilsMement :: = outer-foin-list-eiement \ 
15 * = | = * 

OP-math-opefation-eiement ::= QP-math-operatiorh&ement \ 

-l + l*l/l% 
QPsql-data-typeJIst-efement ::= QP-$ql<iata-typeJist~eJement | 

ODSC-cteta-iype 

20 

QPsqi-date-ttme-Tist ODBC-date-time-type 
QP-sqt-numeri&tist ODBC-numeric-type 
QPsqt-char-tist :: = ODBC-char-type 
QPsqt-binary-tist :: = ODBC-binary-type 

25 

SQL Statements:

statement ::= alter-table-statement |
    create-index-statement |
    create-table-statement |
    create-view-statement |
    delete-statement-positioned |
    delete-statement-searched |
    drop-index-statement |
    drop-table-statement |
    drop-view-statement |
    grant-statement |
    insert-statement |
    revoke-statement |
    select-statement |
    select-for-update-statement |
    update-statement-positioned |
    update-statement-searched

alter-table-statement ::=
    ALTER TABLE base-table-name
    { ADD column-identifier data-type
    | ADD (column-identifier data-type [, column-identifier data-type]... )
    }

alter-table-statement ::=
    ALTER TABLE base-table-name
    { ADD column-identifier data-type
    | ADD (column-identifier data-type [, column-identifier data-type]... )
    | DROP [COLUMN] column-identifier [CASCADE | RESTRICT]
    }

create-index-statement ::=
    CREATE [UNIQUE] INDEX index-name
    ON base-table-name
    ( column-identifier [ASC | DESC]
    [, column-identifier [ASC | DESC] ]... )

create-table-statement ::=
    CREATE TABLE base-table-name
    (column-element [, column-element] ...)

column-element ::= column-definition | table-constraint-definition

column-definition ::=
    column-identifier data-type
    [DEFAULT default-value]
    [column-constraint-definition[ column-constraint-definition]...]

column-constraint-definition ::=
    NOT NULL
    | UNIQUE | PRIMARY KEY
    | REFERENCES ref-table-name referenced-columns
    | CHECK (search-condition)

table-constraint-definition ::=
    UNIQUE (column-identifier [, column-identifier] ...)
    | PRIMARY KEY (column-identifier [, column-identifier] ...)
    | CHECK (search-condition)
    | FOREIGN KEY referencing-columns REFERENCES
      ref-table-name referenced-columns

create-view-statement ::=
    CREATE VIEW viewed-table-name
    [( column-identifier [, column-identifier]... )]
    AS query-specification

delete-statement-positioned ::=
    DELETE FROM table-name WHERE CURRENT OF cursor-name

delete-statement-searched ::=
    DELETE FROM table-name [WHERE search-condition]

drop-index-statement ::=
    DROP INDEX index-name

drop-table-statement ::=
    DROP TABLE base-table-name
    [ CASCADE | RESTRICT ]

drop-view-statement ::=
    DROP VIEW viewed-table-name
    [ CASCADE | RESTRICT ]

grant-statement ::=
    GRANT {ALL | grant-privilege [, grant-privilege]... }
    ON table-name
    TO {PUBLIC | user-name [, user-name]... }

grant-privilege ::=
    DELETE
    | INSERT
    | SELECT
    | UPDATE [( column-identifier [, column-identifier]... )]
    | REFERENCES [( column-identifier [, column-identifier]... )]

insert-statement ::=
    INSERT INTO table-name [(column-identifier [, column-identifier]...)]
    VALUES (insert-value[, insert-value]... )

insert-statement ::=
    INSERT INTO table-name [( column-identifier [, column-identifier]... )]
    { query-specification | VALUES (insert-value [, insert-value]...)}

revoke-statement ::=
    REVOKE {ALL | revoke-privilege [, revoke-privilege]... }
    ON table-name
    FROM {PUBLIC | user-name [, user-name]... }
    [ CASCADE | RESTRICT ]

revoke-privilege ::=
    DELETE
    | INSERT
    | SELECT
    | UPDATE
    | REFERENCES

select-statement ::=
    SELECT [ALL | DISTINCT] select-list
    FROM table-reference-list
    [WHERE search-condition]
    [order-by-clause]

select-statement ::=
    SELECT [ALL | DISTINCT] select-list
    FROM table-reference-list
    [WHERE search-condition]
    [GROUP BY column-name [, column-name]... ]
    [HAVING search-condition]
    [order-by-clause]

select-statement ::=
    SELECT [ALL | DISTINCT] select-list
    FROM table-reference-list
    [WHERE search-condition]
    [GROUP BY column-name [, column-name]... ]
    [HAVING search-condition]
    [UNION [ALL] select-statement]...
    [order-by-clause]

select-for-update-statement ::=
    SELECT [ALL | DISTINCT] select-list
    FROM table-reference-list
    [WHERE search-condition]
    FOR UPDATE OF [column-name [, column-name]...]

update-statement-positioned ::=
    UPDATE table-name
    SET column-identifier = {expression | NULL}
    [, column-identifier = {expression | NULL}]...
    WHERE CURRENT OF cursor-name

update-statement-searched ::=
    UPDATE table-name
    SET column-identifier = {expression | NULL}
    [, column-identifier = {expression | NULL}]...
    [WHERE search-condition]

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While the invention has been illustrated and described in 
detail in the drawings and foregoing description, such illus- 
tration and description is to be considered as exemplary and 
not restrictive in character, it being understood that only the 
preferred embodiment and minor variants thereof have been
shown and described and that all changes and modifications 
that come within the spirit of the invention are desired to be 
protected. 

What is claimed is: 

1. Computer interpretable grammatical rules for generating
a plurality of queries in a grammar for testing a database
engine driver, said grammatical rules comprising:

static elements for generating constant portions of said
plurality of queries, wherein said static elements are
copied to a buffer associated with a computer to generate
said plurality of queries from said grammatical
rules; and

variable elements selected from at least one of a group
consisting of: a query element and a query list element,
for generating database engine driver specific portions
of said plurality of queries, wherein said variable
elements are replaced in said buffer by values specific
to a particular database engine driver to generate said
plurality of queries from said grammatical rules.

2. The grammatical rules of claim 1 wherein said query 
element is enclosed by a start delimiter and an end delimiter. 

3. The grammatical rules of claim 2 wherein said start
delimiter is a less-than character ("<") and said end delimiter
is a greater-than character (">"), and a comma character
(",") separates adjacent ones of said plurality of values in
said query element.

4. The grammatical rules of claim 1 wherein said query 
list element is enclosed by a start delimiter and an end 
delimiter. 

5. The grammatical rules of claim 4 wherein said start
delimiter is a left square brace character ("["), said end
delimiter is a right square brace character ("]"), and a
comma character (",") separates adjacent ones of said
plurality of values in said query list element.

6. A computer operable method for testing a database
driver, said method comprising:

parsing a meta-language statement into at least one
meta-language statement token each comprised of at least
one token element where any one of said at least one
meta-language statement token that is comprised of
more than one token element is a variable token element
delimited by a pair of variable token element
delimiters and said variable token element is a type
selected from at least one of a group consisting of: a
query element and a query list element;

expanding said meta-language statement into a plurality 
of meta-language test queries comprised of one of said 
plurality of meta-language test queries for each unique 
combination of said at least one token element in each 
of said at least one meta-language statement token; 

generating a plurality of data type specific test queries 
from said plurality of meta-language test queries by 
direct substitution of a data type specific database 
driver query element for each substitutable one of said 
at least one token element in each of said plurality of 
meta-language test queries; 

repeating said step of generating for each data type 
supported by said database driver; and 

applying said plurality of data type specific test queries to 
said database driver. 

7. A method according to claim 6 wherein said pair of 
variable token element delimiters for said query element 
include a less-than symbol ("<") as a start delimiter and a 
greater-than symbol (">") as an end delimiter, and individual 
tokens of said query element are separated by a comma 
symbol (","). 

8. A method according to claim 6 wherein said pair of 
variable token element delimiters for said query list element 
include a left square brace symbol ("[") as a start delimiter 
and a right square brace symbol ("]") as an end delimiter, 
and individual tokens of said query list element are separated
by a comma symbol (",").

9. A rule-based test apparatus for generating a plurality of 
test commands to test a database engine driver, said appa- 
ratus comprising: 

a memory; 

a processor connected to said memory; 

a plurality of rules, stored in said memory, wherein each 
of said plurality of rules is encoded in a meta-language 
syntax used to represent a plurality of test commands 
that are executable on corresponding ones of a plurality 
of database engine drivers; and 

processing means, operable in said processor, for parsing 
a meta-language statement input and for generating 
each of said plurality of test commands according to 
said plurality of rules and for applying each of said 
plurality of test commands to said database engine 
driver. 
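
The expansion step recited in claims 6 through 8 can be pictured with a minimal Python sketch. It is an illustration only, not the patented implementation: the function name `expand_statement` and the sample statement are hypothetical, and the sketch assumes that every "<...>" or "[...]" span in the input is a variable token, so a statement whose static text contains literal angle or square brackets would need escaping that is not shown here.

```python
import itertools
import re

# A variable token is either a query element <a, b> or a query list element [a, b].
TOKEN = re.compile(r"<([^<>]*)>|\[([^\[\]]*)\]")


def expand_statement(statement):
    """Return one query string per unique combination of token elements."""
    parts = []  # alternating static text and lists of alternatives
    pos = 0
    for match in TOKEN.finditer(statement):
        # Static element: copied unchanged into every generated query.
        parts.append([statement[pos:match.start()]])
        body = match.group(1) if match.group(1) is not None else match.group(2)
        # Variable element: comma-separated alternatives.
        parts.append([value.strip() for value in body.split(",")])
        pos = match.end()
    parts.append([statement[pos:]])
    queries = ("".join(combo) for combo in itertools.product(*parts))
    # dict.fromkeys drops duplicates while keeping generation order.
    return list(dict.fromkeys(queries))


if __name__ == "__main__":
    for query in expand_statement("SELECT <c1, c2> FROM t WHERE c1 [=, !=] 0"):
        print(query)
```

Each generated string would then have its remaining placeholders replaced by driver-specific values before being applied to the driver under test; that substitution step is not shown.
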
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ABSTRACT 



An Application Programming Interface (API) provides 
interoperability between different monitoring and adminis- 
trative components of a data warehouse system that utilizes 
different standard protocols. One of the protocols is the well
known data connectivity protocol, Open Database Connec- 
tivity (ODBC) that defines a standard interface between 
applications and data sources. A second one of the protocols 
is the well known network management protocol, Simple 
Network Management Protocol (SNMP) that defines a stan- 
dard interface between an agent component and a network 
management system. The API provides a facility that 
enables the different components to access user and con- 
nection information maintained by an ODBC server com- 
ponent derived from servicing client system application 
SQL queries made by system users. 
APPLICATION PROGRAMMING INTERFACE FOR MONITORING DATA WAREHOUSE ACTIVITY OCCURRING THROUGH A CLIENT/SERVER OPEN DATABASE CONNECTIVITY INTERFACE

BACKGROUND OF THE INVENTION

1. Field of Use

The present invention relates to systems and methods for monitoring information accesses and more particularly the usage of a data warehouse and the information contained therein.

2. Prior Art

Data warehouses are becoming more and more important to businesses. The term "data warehouse" is generally used to describe a database containing data that was gathered from a variety of sources (e.g. existing production databases). For more information regarding the nature of a data warehouse, reference may be made to the article entitled, "Data Warehousing: An Introduction" by Grayce Booth which appeared in the May/June 1995 issue of the Bull S. A. technical journal entitled, "Technical Update."

Typically, the data warehouse is implemented as a large amount of data stored in a database with access to the data coming from hundreds of users executing commodity applications like Excel, running on personal computers (PCs). Here, an opportunity for a business exists to manage the data warehouse system. It is useful to the warehouse owner to have information and statistics about the usage of the warehouse and its data. Such information includes: (a) how many users are currently logged onto the system; (b) what is the pattern of access statistics; (c) what data is accessed most frequently; (d) what if any indexes could be added or dropped to improve access efficiency; (e) what if any unlawful access attempts have occurred; and (f) what query runs the longest. Some of this information can be obtained from the warehouse database system but each type of database gathers this information in a different proprietary manner. Therefore, there is an opportunity to be able to provide usage data in a standard fashion for all database types. Also, there is the ability to provide the information through standard system management tools based on standard protocols, such as the Simple Network Management Protocol (SNMP).

As well known in the art, the Open Data Base Connectivity (ODBC) application programming interface is a standard defined by Microsoft Corporation by which Windows based tools and applications may access different databases on many different server platforms. Many PC vendors have adopted ODBC. Without using ODBC, applications are required to use APIs specific to a database vendor for accessing data warehouse information. Using ODBC, an application may access any type of database. In addition, ODBC is used by application tools such as EXCEL such that specific code is not required for each database type being accessed.

Client/Server ODBC is a newer technique for implementing ODBC. The interface to ODBC for user applications remains on the PC but the bulk of the ODBC logic is moved to a server side implementation. All PC users execute their data requests through a common ODBC server. This arrangement provides a "thin" client requirement for the PC user of ODBC and makes the administration of ODBC possible from a single server. This single point of access through the ODBC server also provides the opportunity for administrating the data warehouse. All clients PCs that need to access the data warehouse come through the single point of access (i.e. ODBC server).

Although ODBC provides a common PC based API, each relational database management system (RDBMS) vendor typically has implemented a unique interface for data access. To adapt tools based on ODBC to the interfaces used by various types of RDBMS, Microsoft Corporation specifies the development of a "driver". The driver transforms the ODBC API standard calls into RDBMS specific calls. The use of ODBC provides a layer of consistency above each of the APIs implemented by the RDBMS vendors. In the prior art, a separate ODBC driver was required for each type of RDBMS to be accessed. Additionally, each database vendor typically, requires a tailored communications link. An improvement to this approach is to provide a single data access (DDA) ODBC driver to replace multiple customized ODBC drivers with a single implementation that can access [...].

An example of the above type of system is the distributed data warehouse middleware described in the article entitled "The Distributed Data Warehouse Solution" by Kirk Mosher and Ken Rosensteel that also appeared in the above referenced May/June 1995 of the Technical Update Journal. The system utilizes a proprietary based infrastructure called DDW/NET that works in conjunction with the DDW/ODBC driver. DDW/NET enables connections to multiple computer architectures, operating systems, and network protocols. The DDW/NET software resides on each of the legacy and server systems that communicate over standard communications links and hides the details of networking from the upper layers of software on each system.

The above prior art system included several features to aid the administrator of the data warehouse. Such features included an SNMP agent that monitored the activity of the distributed data warehouse (DDW) processes and users of the data warehouse and a Usage Monitor facility that recorded SQL database queries issued by individual users. Each of these features required the use of an interface to the DDW Net on a UNIX based platform to help gather the required information. This approach required the use of proprietary interfaces that made it difficult to expand the types of databases used by the system. The data that was needed was not easily accessible from the DDW Net memory. DDW Net design was based on Ingres technology, that could not be easily enhanced. This prior art approach is described in the publication entitled DDW Administrator's Guide, dated Apr. 25, 1997, copyright Bull S. A. and Bull HN Information Systems Inc. 1995, 1996, 1997, Order Number 86 A2 83FC Rev4.

Accordingly, it is a primary object of the present invention to provide a system and method for facilitating monitoring of data warehouse activity.

It is a further more specific object of the present invention to provide an interface arrangement that simplifies data warehouse monitoring through standard protocols.

SUMMARY OF THE INVENTION

The above objects are achieved in a preferred embodiment of the present invention that provides a special application programming interface (API) that provides interoperability between standard protocols utilized in conjunction with the monitoring and administration managing tool components of a data warehouse system. One protocol is the well known data connectivity protocol Open Database Connectivity (ODBC), that defines a standard interface between applications and data sources. Another protocol is the well known network management protocol Simple Network Management Protocol (SNMP) that defines a standard interface between an agent component and a network management system.

In the preferred embodiment, the warehouse components include a local SNMP agent component for gathering data pertaining to the activity of a distributed data warehouse (DDW) processes and the users of the DDW system and a usage monitor component for tracking statistics about the different types of SQL queries issued by individual system users. According to the present invention, the warehouse components further include ODBC server and driver components for operatively connecting to the DDW system target warehouse database for processing SQL queries submitted by warehouse knowledge workers. The ODBC server component also operatively couples to an SQL log that it uses to maintain entries pertaining to user SQL queries it receives from a number of ODBC client user systems. The usage monitor component operatively couples to the SQL log and performs the function of gathering data from the entries that it uses to populate tables of a usage monitor database that it maintains for providing usage statistics.

The SNMP agent component performs further monitoring functions. The component operatively couples to the ODBC server component through the special API that enables such components to have access to a variety of types of information received from the ODBC server through the ODBC protocol and report/store such information in a MIB database of a further warehouse component that corresponds to a centrally located SNMP server component via the SNMP protocol.

In accordance with the present invention, the special API provides the following types of information: Server Listen Address and Number of Active Connections. For each active connection, the Connection ID, login time, number of messages sent, number of messages received, number of bytes received, last message and last message direction. Additionally, the API provides other configuration information relevant to the server such as network ports used and server name. The special API is used by the local SNMP agent component to gather real time information about data warehouse usage and reports that information to the centrally located manager server unit.

In accordance with the teachings of the present invention, the ODBC server component operatively records entries having a predetermined format (e.g. ASCII format) into two log files. Entries for every user login to the ODBC server component are recorded in the first log file. Every SQL statement sent to the ODBC server component and information identifying the user that issued the statement, the time of execution, the elapsed time, etc. is recorded in the second log file. The usage monitor component periodically reads the second log file and writes the statistics about usage into the usage monitor database. By using the ODBC server information, no software components need be inserted between the end users and the data warehouse to gather the usage information. Since the OD [...]

The above objects and advantages of the present invention will be better understood from the following description when taken in conjunction with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is an overall block diagram of a data warehousing system that includes the API of the present invention.

FIG. 2 illustrates a memory map organized according to the present invention.

FIG. 3 is a flow chart used in describing the operation of the present invention.

DESCRIPTION OF THE PREFERRED EMBODIMENT

FIG. 1 is a block diagram of a distributed data warehousing (DDW) system 10 that includes the Application Programming Interface (API) of the present invention. As shown, the DDW system 10 includes a target data warehouse 20 that operatively connects to a plurality of ODBC Windows based personal computer (PC) client systems 26a through 26c over a corresponding number of communications links 25a through 25c and to an SNMP server component 30 over a communications link 29. The data warehouse system 20 manages a target warehouse database 23 representing the database that implements the data warehouse or data mart. As indicated, the DDW system 10 utilizes client/server ODBC technology that removes the requirement for a DBMS network connection on the client and software requirements. That is, the PC requirement is reduced to the application to be run, the ODBC client software and a WinSock TCP/IP connection. The interface to the ODBC client system remains consistent with the ODBC interface standard requiring no changes to existing applications. An ODBC server component 20-12 is used to make the connection to the client systems. The ODBC server component 20-12 acts as an application to a standard ODBC driver component such as component 20-10. The ODBC server component 20-12 runs as any other application in an assigned area of memory that it uses to provide routines and store table structures required for processing database queries and responding to API calls. The organization of the ODBC allocated memory area will be discussed in greater detail with reference to FIG. 2.

The ODBC driver component 20-10 provides the access to the target database 24 that may be implemented as any one of a number of well-known vendor database systems (e.g. ORACLE, INFORMIX, etc.). When more than one type of database system is used, the ODBC server component 20-12 utilizes Windows ODBC driver manager software. This software provides a thin layer interface to a number of different DBMS ODBC drivers that enables applications to [...] their applications and thus remain independent.

As shown, the ODBC server component 20-12 operatively couples to an ODBC server SQL database/log 23 that it uses to maintain information entries having a predetermined format for recording client application DBMS accesses. Each database write is made via the ODBC server component 20-12 that in turn results in the appropriate log entries being written into the SQL log 23.

For the purposes of the present invention, the ODBC components may be implemented with standard ODBC software components provided by Microsoft Corporation. The implementation of ODBC drivers is well known in the art. For example, reference may be made to an article entitled "Writing an ODBC Driver" by Dennis R. McCarthy published in the [...] 1995 issue of the publication, Dr. Dobb's Journal.

The SNMP server component 30 also connects to other SNMP agents via communications links such as SNMP agent 34 via a communications link 33. More specifically, SNMP server component 30 includes SNMP dispatcher daemon software that enables the component to host more than one SNMP agent at the same time. The SNMP server 30 includes an SNMP administration interface 32 and a Management Information Base (MIB) database 28. The MIB database 28 is organized in a tree structure whose
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branches identify information objects. Each object within 
the MIB corresponds to an item of information. In the
preferred embodiment, a level of the tree structure is allo- 
cated to data warehouse objects. This organization is 
described in greater detail, in the above referenced DDW 
Administrator's Guide. An appendix included herein pro- 
vides a list of data warehouse objects utilized by the API of 
the present invention. 

The system 20 includes a set of tool components that 
includes an SNMP agent component 20-2 that operatively 
couples to a usage monitor component 20-6. The SNMP
agent component 20-2 is the facility utilized to administer 
the data warehouse system 20 from the SNMP manager 
server unit 30 via communications link 29 utilizing the 
SNMP protocol. 

The SNMP agent component 20-2 contains the necessary 
mechanisms for gathering data from the ODBC server 
component 20-12 as described herein. The SNMP agent
component 20-2 utilizes the SNMP server unit's MIB data- 
base 23 for recording required information utilized in moni- 
toring activities conducted via the SNMP administration 
interface 32. In the preferred embodiment, the SNMP server 
unit 30 and interface 32 corresponds to the ISM server 
developed and marketed by Bull HN Information Systems 
Inc. 

The usage monitor component 20-6 also operatively 
couples to a usage database 22 and to the ODBC server SQL 
log 23. The component 20-6 uses the information contained 
in log 23 for generating and maintaining statistics pertaining 
to client user database activities. 

The SNMP agent component 20-2 operatively couples to 
the ODBC server component 20-12 through an API library 
component 20-8 constructed according to the teachings of 
the present invention. As described herein in greater detail,
the API library component 20-8 can be considered part
of the ODBC server component 20-12. The component 20-8
provides a number of functions that enable the tool components
of the warehouse system 20 to cooperate with the
ODBC server component 20-12 in a manner required to
carry out the required monitoring and administration of
client system activities.

FIG. 2 - Memory Map

FIG. 2 illustrates in greater detail, the organization of the 
allocated memory area utilized by the ODBC server com- 
ponent 20-12. As shown, the memory area includes the API 
library component 20-8 and several key tables utilized by
ODBC server component 20-12. These key tables are: a 
servers table 200-1 for keeping track of servers (ODBC or 
SNMP), a connections table 200-2 for keeping track of 
active end user connections that are using ODBC client 
software to query the data warehouse, and a requests table 
200-3 for keeping track of active SQL requests from the end 
users. 

The API library has one function call for each of the
ODBC server tables. Each of the API library calls will return 
all of the rows of the table being addressed. The first 
invocation of the function takes a snapshot of the ODBC 
server memory and allows all of the ODBC operations that 
will affect the contents of the memory to continue. This first 
call then returns the first entry from the table. Subsequent 
calls to the API return the subsequent rows of the snapshot, 
until an end of file (EOF) indicator is reached. The EOF is 
signaled through a "RETURN_CODE" parameter of the 
function call. The API functions are set forth in greater detail 
in an Appendix. 
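
The snapshot-then-iterate behaviour described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the API set forth in the Appendix: the class name `TableCursor`, the method `get_next`, the numeric return codes, and the choice to start a fresh snapshot after EOF are all hypothetical.

```python
import copy

# Hypothetical RETURN_CODE values; the source only says EOF is signaled
# through a "RETURN_CODE" parameter.
RC_OK = 0
RC_EOF = 1


class TableCursor:
    """Row-at-a-time access to one server table through a private snapshot."""

    def __init__(self, live_table):
        self._live = live_table   # list of row dicts the server keeps updating
        self._snapshot = None     # private copy taken on the first call
        self._next = 0

    def get_next(self):
        """Return (return_code, row); row is None once EOF is reached."""
        if self._snapshot is None:
            # First invocation: copy the table so that server operations
            # can continue to change the live memory without affecting us.
            self._snapshot = copy.deepcopy(self._live)
            self._next = 0
        if self._next >= len(self._snapshot):
            self._snapshot = None  # assumption: the next call re-snapshots
            return RC_EOF, None
        row = self._snapshot[self._next]
        self._next += 1
        return RC_OK, row


if __name__ == "__main__":
    connections = [{"connection_id": 1}, {"connection_id": 2}]
    cursor = TableCursor(connections)
    rc, row = cursor.get_next()
    while rc == RC_OK:
        print(row)
        rc, row = cursor.get_next()
```

Copying the table on the first call is what lets a caller such as the SNMP agent walk a consistent set of rows while the server keeps adding and removing connections.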

DESCRIPTION OF OPERATION 

With reference to FIGS. 1 through 3, the operation of the 
preferred embodiment of the present invention will now be 
described relative to the flowchart of FIG. 3. Referring to FIG. 3, it is seen from block 300-1 that initially, the systems are activated. That is, the ODBC server component 20-12 is started up by the SNMP administration interface component 32. Startup is invoked using appropriate Korn shell scripts implemented in a conventional manner. In addition, ODBC applications such as EXCEL are assumed to have been loaded and running on the client systems 26a through 26c. During the running of these applications, end users issue queries that invoke access to target warehouse database 24 as indicated by block 300-2. This results in the ODBC client software generating calls to the ODBC server component 20-12 as indicated in block 300-3. Each call utilizes the ODBC interface protocol that proceeds via TCP/IP network interfaces.

As indicated in block 300-4, the ODBC server component 20-12 monitors for end user requests and forwards the end user SQL access request to the data warehouse database 24 via ODBC driver component 20-10. In response to each call, as indicated in block 300-6, the ODBC server component 20-12 stores the pertinent statistics and user/connection information as entries in the tables 200-1 through 200-3 located in its memory area as indicated in FIG. 2. As discussed herein, it also stores information entries pertaining to the particular SQL query obtained from its tables in the ODBC SQL log 23. Such entries are recorded in two ASCII log files. The first log receives an entry for every user login to the ODBC server component 20-12; the entry is designated as a user session record that contains a number of specified attributes. The second log file receives the SQL statement received by the ODBC server component 20-12, as well as a number of attributes such as information indicating the user that issued the statement, the time, etc.

In greater detail, the ODBC SQL log files are formatted to contain the following information:

Log 1: User_Session record (attributes: User_name, database_name, login_date, logout_date, Session_ID);
Log 2: User_Requests record (attributes: Session_ID, SQL_Text, Query_time, Return_Status, Return_Rows, Return_Bytes, Query_Type, Tuple_Size).

Thus, as indicated in block 300-6, the ODBC server component 20-12 writes statistics and user information into the tables contained in its memory area and into the log files in accordance with the Log 1 and Log 2 formats.

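The two record layouts can be modelled with a short Python sketch. The attribute names follow the Log 1 and Log 2 lists above; everything else is an assumption made for illustration: the source states only that the logs are ASCII, so the "|" field separator, the choice of which attributes are integers, and the helper name `parse_record` are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class UserSession:
    """Log 1: one record per user login."""
    user_name: str
    database_name: str
    login_date: str
    logout_date: str
    session_id: str


@dataclass
class UserRequest:
    """Log 2: one record per SQL statement received by the server."""
    session_id: str
    sql_text: str
    query_time: str
    return_status: str
    return_rows: int
    return_bytes: int
    query_type: str
    tuple_size: int


def parse_record(record_type, line, sep="|"):
    """Split one ASCII log line into the given record type.

    The separator is an assumption; a real log whose SQL text can contain
    the separator would need quoting or escaping, which is not handled here.
    """
    specs = fields(record_type)
    values = line.rstrip("\n").split(sep)
    if len(values) != len(specs):
        raise ValueError(f"expected {len(specs)} fields, got {len(values)}")
    converted = {
        spec.name: int(value) if spec.type is int else value
        for spec, value in zip(specs, values)
    }
    return record_type(**converted)


if __name__ == "__main__":
    print(parse_record(UserRequest, "S1|SELECT 1|0.25|OK|1|4|SELECT|4"))
```

The shared Session_ID attribute is what allows a usage monitor to join each request in Log 2 back to the user session recorded in Log 1.
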
As indicated in FIG. 3, blocks 304-1 through 304-4 

45 indicate how the usage monitor component 20-6 performs its 
functions relative to creating entries in usage database 22. As 
indicated in block 304-1, for each end user session (end user 
connection), the ODBC server component 20-12 writes an 
entry into the SQL log 23 in the format described above. 

Subsequently, for each user that is logged on, for each SQL
request, a log entry will be made to the SQL log 23 as
indicated above. Since there are many users, these log
entries will represent the random requests of many users. It
is seen that the request information is only temporarily
stored in the request table 200-3 contained in the ODBC
server component's memory area. Once written to the log
23, that entry in the SQL Request table 200-3 in the ODBC
memory area is removed. The usage monitor component
reads the ODBC SQL log 23 and writes data entries to a
database record in a defined format as indicated in block
304-2.
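
The write-then-remove handling of request table entries described above might look like the following Python sketch. The function name, the use of a deque for the in-memory table, and the list standing in for the SQL log are hypothetical choices made for illustration; the text does not say how the table or the log is implemented.

```python
from collections import deque


def flush_request_table(request_table, sql_log):
    """Write each pending request entry to the log, then drop it from memory."""
    flushed = 0
    while request_table:
        entry = request_table[0]   # oldest pending entry
        sql_log.append(entry)      # first write it to the log ...
        request_table.popleft()    # ... and only then remove it from the table
        flushed += 1
    return flushed


# Hypothetical in-memory request table and log.
request_table = deque([
    {"session_id": 7, "sql_text": "SELECT * FROM orders"},
    {"session_id": 9, "sql_text": "SELECT COUNT(*) FROM items"},
])
sql_log = []
count = flush_request_table(request_table, sql_log)
```

Removing an entry only after it has been appended means a failed log write would leave the entry in the table rather than lose it.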

The usage monitor component 20-6 then runs reports to
analyze the usage database as indicated in block 304-4.
Queries to summarize the entries by user, by time of day, by
data warehouse table accessed, by query elapsed time, by
row size returned, and by query type, are examples of reports
that could be run.
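
As a rough illustration of such summary reports, the Python sketch below groups usage records by an arbitrary attribute and totals a few counters. The record layout (plain dictionaries), the function name, and the sample data are assumptions; the text does not specify how the usage database is queried.

```python
from collections import defaultdict


def summarize(requests, key):
    """Group usage records by `key`; total the query count, rows and elapsed time."""
    summary = defaultdict(lambda: {"queries": 0, "rows": 0, "elapsed": 0.0})
    for rec in requests:
        bucket = summary[rec[key]]
        bucket["queries"] += 1
        bucket["rows"] += rec["return_rows"]
        bucket["elapsed"] += rec["query_time"]
    return dict(summary)


# Made-up usage records for illustration.
usage = [
    {"user_name": "alice", "query_type": "SELECT", "return_rows": 10, "query_time": 0.5},
    {"user_name": "bob", "query_type": "SELECT", "return_rows": 5, "query_time": 1.0},
    {"user_name": "alice", "query_type": "UPDATE", "return_rows": 1, "query_time": 0.25},
]
by_user = summarize(usage, "user_name")
by_type = summarize(usage, "query_type")
```

The same function covers the "by user" and "by query type" reports simply by changing the grouping key.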
Blocks 302-1 through 302-4 indicate how the SNMP
agent component 20-2 carries out its function of providing
updated user information. As indicated in block 302-1, the
SNMP agent component 20-2 sets an internal timer function
to periodically poll the ODBC server component 20-12 via
the API library component 20-8. At each polling, the SNMP
agent component 20-2 issues, for example, API call
Get_Active_Connections that accesses that API library routine.
This routine accesses the appropriate ODBC server memory
table (i.e., connections table 200-2).

The ODBC server component 20-12 transfers the
requested usage information received from its memory area
to the SNMP agent component 20-2. That is, the ODBC
server component 20-12 obtains and returns the requested
information to the SNMP agent component 20-2.

Next, as indicated in block 302-2, the SNMP agent
component 20-2 transfers the usage information to the
SNMP server component's MIB database 28 utilizing the
SNMP protocol. More specifically, the SNMP agent component
20-2 issues a set command causing the information
item(s) to be stored in the preallocated object areas of the
MIB database 28 designated by the set command. SNMP
server component 30 in response to the set command
performs the required operations for storing the usage
information items. By way of example, these items could include
objects defining the active connections, active servers and
SQL requests issued by client end users that are formatted as
indicated in the Appendix. As indicated in block 302-4, the
administration interface component 32 can be used to gather
and report data from the MIB database 28.

From the above, it is seen how the API of the present
invention is able to provide interoperability between the
monitoring and administration components of a data
warehouse system utilizing an ODBC interface.

It will be appreciated that the teachings of the present
invention may be used in conjunction with other types of
data warehouse systems. Further, the present invention may
be used with other types of application tools and
interoperability protocols such as Java database connectivity
protocols. Still further, the present invention may be incorporated
into other types of data warehouse systems architectures.
Many other changes will immediately occur to those skilled
in the art.

APPENDICES

I. Glossary
II. MIB Objects and API

APPENDIX I

GLOSSARY

In the field of the present invention, the following terms
have the following meanings:

1. API A set of routines used by an application program to
direct the performance of procedures by the computer's
operating system.

2. database management system (DBMS) A layer of software
between the physical database and the user. The
DBMS manages all requests for database action (for
example, queries or updates) from the user. This eliminates
the need for the user to keep track of the physical
details of file locations and formats, indexing schemes,
etc.

3. SQL Originally an acronym for Structured Query Language.
Now the name of the language most commonly
used to access relational databases.

4. administrator An individual who carries out tasks such as
creating databases and/or monitoring the use and performance
of those databases.

5. database A collection of data that has meaning to an
organization or to an individual and that is managed as a
unit.

6. SNMP The network management protocol of TCP/IP. In
SNMP, agents monitor the activity in the various devices
on the network and report to the network console workstation.
Control information is maintained in a structure
known as a management information base (MIB).

7. MIB A management information base comprises a set of
objects describing software administrated by an SNMP
manager such as the Bull ISM system or HP Openview
system.

8. ODBC Open DataBase Connectivity specification provided
by Microsoft Corporation that specifies an application
interface to heterogeneous databases. The specification
is implemented by various DBMS specific drivers
that map the ODBC specification to the DBMS interface.

9. User A physical person or a unit in an enterprise. A user
has a distinguished name, and is associated with
"Authentication" attributes (e.g. password) and "privilege"
attributes (e.g. role, category, classification, etc.).



APPENDIX II 



A. Data Warehouse MIB Objects

This section of the Appendix contains example definitions of the objects defined in 
a section of the Data Warehouse MIB 28 supported by the Data Warehouse 
SNMP agent component 20-2. 
Active Connections

NbActiveConnection OBJECT-TYPE
    SYNTAX Counter
    ACCESS read-only
    STATUS mandatory
    DESCRIPTION "Number of ODBC Server active connections"
    ::= { ODBC 1 }

ActiveCtionTable OBJECT-TYPE
    SYNTAX SEQUENCE OF ActiveConnectionEntry
    ACCESS not-accessible
    STATUS mandatory
    DESCRIPTION "Table containing information for each active
                 ODBC Server connection"
    ::= { ODBC 2 }

ActiveConnectionEntry OBJECT-TYPE
    SYNTAX ActiveConnectionEntry
    ACCESS not-accessible
    STATUS mandatory
    DESCRIPTION "Information on one ODBC Server active
                 connection"
    INDEX { ActiveCtionID }
    ::= { ActiveCtionTable 1 }

ActiveConnectionEntry ::= SEQUENCE {
    ActiveCtionID           Counter,
    ActiveCtionType         INTEGER,
    ActiveCtionLoginTime    DisplayString,
    ActiveCtionNBMsgSent    Counter,
    ActiveCtionNBMsgRcvd    Counter,
    ActiveCtionBytesRcvd    Counter,
    ActiveCtionLastMsgType  DisplayString,
    ActiveCtionLastMsgDir   INTEGER
}

ActiveCtionID OBJECT-TYPE
    SYNTAX Counter
    ACCESS read-only
    STATUS mandatory
    DESCRIPTION "Unique Identifier for one ODBC Server
                 active connection"
    ::= { ActiveCtionEntry 1 }

ActiveCtionType OBJECT-TYPE
    SYNTAX INTEGER {
        server (1),
        client (2)
    }
    ACCESS read-only
    STATUS mandatory
    DESCRIPTION "Connection type"
    ::= { ActiveCtionEntry 2 }

ActiveCtionLoginTime OBJECT-TYPE
    SYNTAX DisplayString
    ACCESS read-only
    STATUS mandatory
    DESCRIPTION "The log-in time for this connection"
    ::= { ActiveCtionEntry 3 }

ActiveCtionNBMsgSent OBJECT-TYPE
    SYNTAX Counter
    ACCESS read-only
    STATUS mandatory
    DESCRIPTION "Number of messages sent since the log-in
                 time of this connection"
    ::= { ActiveCtionEntry 4 }

ActiveCtionNBMsgRcvd OBJECT-TYPE
    SYNTAX Counter
    ACCESS read-only
    STATUS mandatory
    DESCRIPTION "Number of messages received since the
                 log-in time of this connection"
    ::= { ActiveCtionEntry 5 }

ActiveCtionBytesRcvd OBJECT-TYPE
    SYNTAX Counter
    ACCESS read-only
    STATUS mandatory
    DESCRIPTION "Number of bytes received since the log-in
                 time of this connection"
    ::= { ActiveCtionEntry 6 }

ActiveCtionLastMsgType OBJECT-TYPE
    SYNTAX DisplayString
    ACCESS read-only
    STATUS mandatory
    DESCRIPTION "Type of the last message sent for this
                 ODBC Server active connection"
    ::= { ActiveCtionEntry 7 }

ActiveCtionLastMsgDir OBJECT-TYPE
    SYNTAX INTEGER {
        from (1),
        to (2)
    }
    ACCESS read-only
    STATUS mandatory
    DESCRIPTION "Direction of the last message sent for
                 this active connection (from or to this
                 ODBC Server)"
    ::= { ActiveCtionEntry 8 }



The MIB section also includes object definitions for the
various items included in the servers table and SQL requests
table of FIG. 2.



B. Application Programming Interface (API) for Monitoring Data Warehouse 
Activity 

This section of the Appendix describes the API used by the SNMP Agent to access 
the ODBC Server Administration information. 

1) GET_SERVER_PARAMETERS(
       PROCESS_ID,
       LISTEN_ADDRESS,
       SERVER_NAME,
       RETURN_CODE)

2) GET_ACTIVE_CONNECTIONS(
       SESSION_ID,
       CONNECT_ID,
       USER_ID,
       DATABASE_NAME,
       LOGIN_DATE_TIME,
       LOGOUT_DATE_TIME,
       NUM_MESSAGES_SENT,
       NUM_MESSAGES_RECEIVED,
       NUM_BYTES_RECEIVED,
       LAST_MESSAGE,
       DIRECTION_OF_LAST_MESSAGE,
       RETURN_CODE)

3) GET_ACTIVE_SQL_REQUESTS(
       SESSION_ID,
       SQL_TEXT,
       QUERY_START_TIME,
       QUERY_STOP_TIME,
       RETURN_STATUS,
       NUM_ROWS_RETURNED,
       NUM_BYTES_RETURNED,
       QUERY_TYPE,
       TUPLE_SIZE,
       RETURN_CODE)



One row is returned with each function call; the function
should be called repeatedly until return code EOF (end of
file) is encountered.
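
The one-row-per-call convention can be sketched in Python as a loop that keeps calling until EOF is returned. Everything named here is hypothetical: the stand-in class, the numeric values of the return codes, and the choice to return a (row, return code) pair instead of filling output parameters are assumptions made only to keep the example self-contained.

```python
OK = 0    # assumed value; the text does not give numeric return codes
EOF = -1  # assumed value for the end-of-file return code


class FakeOdbcServerApi:
    """Hypothetical stand-in for the API library, backed by an in-memory table."""

    def __init__(self, connections):
        self._rows = list(connections)
        self._cursor = 0

    def get_active_connections(self):
        """Return (row, return_code): one row per call, EOF once exhausted."""
        if self._cursor >= len(self._rows):
            self._cursor = 0  # rewind so the next polling cycle starts over
            return None, EOF
        row = self._rows[self._cursor]
        self._cursor += 1
        return row, OK


def poll_active_connections(api):
    """Call the API repeatedly, collecting rows until EOF is encountered."""
    rows = []
    while True:
        row, code = api.get_active_connections()
        if code == EOF:
            break
        rows.append(row)
    return rows


api = FakeOdbcServerApi([
    {"session_id": 1, "user_id": "alice"},
    {"session_id": 2, "user_id": "bob"},
])
snapshot = poll_active_connections(api)
```

A caller such as the SNMP agent would run this loop once per polling interval and forward the collected rows.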

While in accordance with the provisions and statutes there 
has been illustrated and described the best form of the 
invention, certain changes may be made without departing 
from the spirit of the invention as set forth in the appended 
claims and that in some cases, certain features of the 
invention may be used to advantage without a corresponding 
use of other features. 

What is claimed is: 

1. A method for facilitating interoperability between components
of a data warehouse system containing a warehouse
database for storing warehouse information, the components 
including a number of different monitoring and administra- 
tion components for monitoring users and recording infor- 
mation relating to the activity of warehouse processes per- 
taining to accessing information stored in the warehouse 
database, the method comprising: 

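(a) including in the warehouse system, an ODBC server
component operatively coupled to a number of ODBC
client systems for receiving SQL requests through a
first standard protocol used for data connectivity, the
ODBC server component being operatively coupled to
the warehouse database through an ODBC driver component
for accessing information from the warehouse
database using the first standard data connectivity protocol;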

(b) including in the warehouse system, a storage log 
facility operatively coupled to the ODBC server com- 
ponent and to a predetermined one of the different 
warehouse components for enabling storage of infor- 
mation pertaining to user sessions and SQL queries by 
the ODBC server component for optimizing warehouse
database storage and interfaces; and, 

(c) including in the warehouse system an API component 
as part of the ODBC server component that provides 
interoperability between the first standard protocol and 
other standard protocols for enabling the different 
warehouse monitoring and administration components 
to perform their functions pertaining to the warehouse 
database utilizing information received from the 
ODBC client systems and stored in the storage log 
facility. 

2. The method of claim 1 wherein the method further 
comprises the step of including in the number of different 
monitoring and administration warehouse components: 



(d) a network agent component for performing the function
of monitoring users and the activity of warehouse
processes utilizing one of the other standard protocols;
and,

(e) an usage monitoring component coupled to the network
agent component, the usage monitoring component
for performing the function of recording in a usage
database, information pertaining to user sessions and
SQL queries issued by the individual users of the client
systems to access the warehouse database.

3. The method of claim 2 further including the step of
populating the usage database with records having a predefined
format by the usage monitoring component accessing
information from the storage log facility.

4. The method of claim 1 wherein step (b) further
includes:

storing the information as entries having a predetermined
format into a number of different log files.

5. The method of claim 4 wherein the method further
includes the steps of:

storing entries in a first file of the number of different log
files corresponding to records identifying user sessions
and their attributes; and

storing entries in a second file of the number of different
log files corresponding to records identifying user SQL
statements and their attributes.

6. The method of claim 5 wherein each record stored in
the first log file is formatted to include the following
information: User_name; Database_name; Login-date;
Logout-date; Session_ID; and wherein each record of the
second log file is formatted to include the following
information:

User_Requests; Session_ID; SQL-text; Query-time;
Return_status; Return_rows; Return_bytes; Query-type
and Tuple-size.

7. The method of claim 1 wherein the method further
comprises the step of including in the ODBC server
component, an allocated memory area for storing routines
included in the API component for maintaining interoperability
between warehouse components and a number of
table structures for storing entries pertaining to tracking
servers operation, active end user database connections and
end user SQL requests.

8. The method of claim 7 wherein the routines of the API
component includes a first routine for obtaining parameters
for the ODBC server component and warehouse server
components, a second routine for obtaining information
pertaining to active end user connections and a third routine
for obtaining information pertaining to active SQL query
requests made by end users.

9. The method of claim 7 wherein the number of table
structures includes a servers table, a connections table and
an SQL requests table.

10. The method of claim 7 wherein the servers table
includes the following information sections: a process ID
section, a listen address section and a server name section.

11. The method of claim 7 wherein the connections table
includes the following information sections: session ID,
connect ID, user ID, database name, login date/time; logout
date/time, number of messages sent, number of messages
received, number of bytes received, last message and the
direction of the last message.

12. The method of claim 7 wherein the SQL requests table
includes the following information sections: session ID,
SQL text, query start time, query stop time, return status,
number of rows returned, number of bytes returned, query
type and tuple size.

13. The method of claim 2 wherein the usage monitoring
component performs the functions of reading the storage log
facility and writing data records in a defined format and
running reports for analyzing the usage database.

14. The method of claim 7 wherein the number of
different monitoring and administration warehouse components
further includes an SNMP server component that
operatively couples to the network agent component corresponding
to an SNMP agent component and includes a MIB
database for storing a number of information objects, the
method further including the step of periodically polling the
ODBC server component by the SNMP server component
through the API component routines utilizing the one of the
standard protocols corresponding to an SNMP protocol for
obtaining current usage information from the table structures
of the allocated memory area of the ODBC server component
for transfer to a section of the MIB database allocated
for monitoring data warehouse activity.

15. The method of claim 14 wherein the section is
organized to contain objects being managed by the SNMP
server component defining active connections, active servers
and SQL requests issued by client end users.

16. The method of claim 15 wherein the method further
comprises the step of including an administration interface
in the SNMP server component for enabling an administrator
to gather and report warehouse data activity derived from
the objects stored in the MIB database.

17. The method of claim 16 wherein the method further
comprises the step of including enabling the starting and
stopping of the ODBC server component and for operating
different ones of the warehouse components of the warehouse
system.

18. A facility for providing interoperability between components
of a data warehouse system containing a warehouse
database for storing warehouse information, the components
including a number of different monitoring and administration
components for monitoring users and recording information
relating to the activity of warehouse processes pertaining
to accessing information stored in the warehouse
database, the facility comprising:

(a) an ODBC server component operatively coupled to a
number of ODBC client systems for receiving SQL
requests through a first standard protocol used for data
connectivity, the ODBC server component being operatively
coupled to the warehouse database through an
ODBC driver component for accessing information
from the warehouse database using the first standard
data connectivity protocol;

(b) a storage log facility operatively coupled to the ODBC
server component and to a predetermined one of the
warehouse components for enabling storage of information
pertaining to user sessions and SQL queries by
the ODBC server component for optimizing warehouse
database storage and interfaces; and,

(c) an API component included as part of the ODBC
server component that provides interoperability
between the first standard protocol and other standard
protocols for enabling the different warehouse monitoring
and administration components to perform their
functions relating to the warehouse database utilizing
information received from the ODBC client systems
and stored in the storage log facility.

19. The facility of claim 18 wherein the number of
different monitoring and administration warehouse components
includes:

(a) a network agent component for performing the function
of monitoring users and the activity of warehouse
processes utilizing one of the other standard protocols;
and,




(b) an usage monitoring component coupled to the network
agent component, the usage monitoring component
recording in a usage database, information pertaining
to user sessions and SQL queries issued by the
individual users of the client systems to access the
warehouse database.

20. The facility of claim 19 wherein the functions performed
by the usage monitor component operates to access
information from the storage log facility to populate the
usage database with records having a predefined format.

21. The facility of claim 18 wherein the ODBC server
component stores the information as entries having a predetermined
format into a number of different log files.

22. The facility of claim 21 wherein the number of log
files includes:

a first file for storing entries corresponding to records
identifying user sessions and their attributes; and

a second file for storing entries corresponding to records
identifying user SQL statements and their attributes.

23. The facility of claim 22 wherein each record stored in
the first log file is formatted to include the following
information: User_name; Database_name; Login-date;
Logout-date; Session_ID; and wherein each record of the
second log file is formatted to include the following
information:

User_Requests; Session-ID; SQL text; Query-time;
Return_status; Return_rows; Return-bytes; Query-type
and Tuple_size.

24. The facility of claim 18 wherein the ODBC server
component further includes an allocated memory area for
routines included in the API component for maintaining
interoperability between warehouse components and a number
of table structures for storing entries pertaining to
tracking servers operation, active end user database connections
and end user SQL requests.

25. The facility of claim 24 wherein the routines of the
API component includes a first routine for obtaining parameters
for the ODBC server component and warehouse server
components, a second routine for obtaining information
pertaining to active end user connections and a third routine
for obtaining information pertaining to active SQL query
requests made by end users.

26. The facility of claim 24 wherein the number of table
structures includes a servers table, a connections table and
an SQL requests table.

27. The facility of claim 26 wherein the servers table
includes the following information sections: a process ID
section, a listen address section and a server name section.

28. The facility of claim 26 wherein the connections table
includes the following information sections: session ID,
connect ID, user ID, database name, login date/time; logout
date/time, number of messages sent, number of messages
received, number of bytes received, last message and the
direction of the last message.

29. The facility of claim 26 wherein the SQL requests
table includes the following information sections: session
ID, SQL text, query start time, query stop time, return status,
number of rows returned, number of bytes returned, query
type and tuple size.

30. The facility of claim 19 wherein the monitoring
component performs the functions of reading the storage log
facility and writing data records in a defined format and
running reports for analyzing the usage database.

31. The facility of claim 24 wherein the number of
different monitoring and administration warehouse components
further includes an SNMP server component that
operatively couples to the network agent component corresponding
to an SNMP agent component and includes a MIB
database for storing a number of information objects, the
SNMP agent component being operative to periodically poll
the ODBC server component through the API component
routines utilizing the one of the standard protocols corresponding
to an SNMP protocol for obtaining current usage
information from the table structures of the allocated
memory area of the ODBC server component for transfer to
a section of the MIB database allocated for monitoring data
warehouse activity.

32. The facility of claim 31 wherein the section of the
MIB database is organized to contain objects being managed
by the SNMP server component defining active
connections, active servers and SQL requests issued by
client end users.

33. The facility of claim 32 wherein the SNMP server
component further includes an administration interface for
enabling an administrator to gather and report warehouse
data activity derived from the objects stored in the MIB
database.

34. The facility of claim 33 wherein the administration
interface includes facilities for enabling the starting and
stopping of the ODBC server component and for operating
different ones of the warehouse components of the warehouse
system.

*****